Keyword structs, revisited
This revises my Keyword structs post to fix some mistakes, discuss the `struct*` `match` pattern, and rewrite the macro to use `syntax-parse` and support default arguments.
A good rule of thumb in Racket is to use a `struct` instead of a `list` when you’re juggling more than two or three items.
For ad-hoc prototyping, you can use a `list`:
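A minimal sketch of the list approach, assuming `get-person` simply returns a first name, a last name, and an age (the sample values mirror the ones used later in this post):

```racket
#lang racket/base
(require racket/list) ; first, second, third

;; Define: a person is a plain list of first name, last name, and age.
(define (get-person)
  (list "John" "Doe" 32))

;; Use: every field is reached purely by its position in the list.
(define p (get-person))
(first p)  ; "John"
(second p) ; "Doe"
(third p)  ; 32
```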
Getting the stuff out is a bit cleaner using `match`, which lets you “destructure” the list and bind to identifiers in one swell foop:
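Something along these lines, again assuming the list-returning `get-person`; the `greeting` name is only there for illustration:

```racket
#lang racket/base
(require racket/match)

(define (get-person)
  (list "John" "Doe" 32))

;; One pattern both checks the shape (a three-element list)
;; and binds each element to an identifier.
(define greeting
  (match (get-person)
    [(list first last age)
     (format "~a ~a is ~a" first last age)]))
```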
But what if you need to add or delete list members later? It’s error-prone.
That’s where a real `struct` can help:
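As a sketch, assuming a three-field `person` struct whose field names match the accessors used further down (`person-first`, `person-last`, `person-age`); `person-fields` is a name made up for this example:

```racket
#lang racket/base
(require racket/match)

;; Assumed definition: three fields, in this order.
(struct person (first last age))

(define (get-person)
  (person "John" "Doe" 32))

;; Access by name, through the generated accessors.
(define p (get-person))
(person-first p) ; "John"
(person-last p)  ; "Doe"
(person-age p)   ; 32

;; Or destructure with match. Note this struct pattern is by position.
(define person-fields
  (match p
    [(person first last age) (list first last age)]))
```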
Now let’s say you add a social security number field, `ssn`:
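For instance, assuming the new field is added at the end and using an obviously fake placeholder value:

```racket
#lang racket/base

;; Assumption: ssn is appended after the existing three fields.
(struct person (first last age ssn))

(define (get-person)
  (person "John" "Doe" 32 "000-00-0000")) ; placeholder, not a real SSN
```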
Everything still works fine when you access the fields by-name:
```racket
(define p (get-person))
(person-first p) ; "John"
(person-last p)  ; "Doe"
(person-age p)   ; 32
```
However, if you used `match`, which is by-position, that pattern needs to be updated. A pattern written for three fields, such as `(person first last age)`, no longer fits a `person` that has four, so you need to fix it:
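A sketch of the fixed pattern, assuming the four-field `person` with `ssn` last and a placeholder value; `person-fields` is a hypothetical name:

```racket
#lang racket/base
(require racket/match)

(struct person (first last age ssn))

(define (get-person)
  (person "John" "Doe" 32 "000-00-0000")) ; placeholder, not a real SSN

;; The by-position pattern has to account for all four fields,
;; even though this code has no use for the ssn (hence the _ wildcard).
(define person-fields
  (match (get-person)
    [(person first last age _) (list first last age)]))
```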
struct*
This is where the `struct*` `match` pattern can help. By getting the fields by-name, it is insulated from the addition of new fields:
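Roughly like this, assuming the same four-field `person`. Each `[field pattern]` pair names the field explicitly, and fields that are not mentioned (here `ssn`) are simply ignored:

```racket
#lang racket/base
(require racket/match)

(struct person (first last age ssn))

(define (get-person)
  (person "John" "Doe" 32 "000-00-0000")) ; placeholder, not a real SSN

;; struct* matches by field name, so this pattern would read the same
;; whether person had three fields or four.
(define person-fields
  (match (get-person)
    [(struct* person ([first first] [last last] [age age]))
     (list first last age)]))
```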
This needs to be updated only if/when you need the new `ssn` field. So although it’s more verbose, using `struct*` is more resilient.
We could reduce the verbosity by allowing either `[field pat]` or just `field`, where the latter expands to use the same symbol for both the field and pattern, as we wrote out in the example above. This would be a nice enhancement to the official `struct*` in `racket/match`. Meanwhile here’s a `struct**` match expander that wraps `struct*` to do so:
```racket
#lang racket/base

(require racket/match
         (for-syntax racket/base
                     syntax/parse))

(define-match-expander struct**
  (λ (stx)
    ;; A field is either [id pat] or a bare id, which stands for [id id].
    (define-syntax-class field
      (pattern [id:id pat:expr])
      (pattern id:id
               #:with pat #'id))
    (syntax-parse stx
      [(_ struct-id:id (field:field ...))
       #'(struct* struct-id ([field.id field.pat] ...))])))

(module+ test
  (require rackunit)
  (struct foo (a b c))
  (define x (foo 1 2 3))
  ;; Inside the clause, x is rebound to the value of field c.
  (check-equal? (match x [(struct** foo (a b [c x])) (list a b x)])
                (list 1 2 3))
  (check-equal? (match x [(struct* foo ([a a][b b][c c])) (list a b c)])
                (match x [(struct** foo (a b [c c])) (list a b c)])))
```
Making structs
Creating an instance of a struct has exactly the same form/shape as creating a list:
```racket
(list "John" "Doe" 32)
(person "John" "Doe" 32)
```

It’s just `person` instead of `list`. Either way, you’re specifying the fields by-position, not by-name. If you have a struct with more than a few fields:

```racket
(struct foo (a b c d e f g h))
```
Then creating the struct is itself error-prone. You will probably start jotting down comments to help you keep track of what field you’re on:
```racket
(foo 10     ;a
     "foo"  ;b
     13     ;c
     "bar"  ;d
     "baz"  ;e
     #f     ;f
     "x"    ;g
     42)    ;h
```
It would help if we could turn those comments into actual keywords. Using keyword arguments is helpful for any function with more than a few arguments. We’d like to write:
```racket
(foo #:a 10
     #:b "foo"
     #:c 13
     #:d "bar"
     #:e "baz"
     #:f #f
     #:g "x"
     #:h 42)
```
That way, Racket could help us catch mistakes. Even better, we’re free to supply the arguments in a different order, and it’s OK. It’s by-name, not by-position.
As a bonus, it would be great to have optional arguments, with a default value. (Especially since `struct`’s `#:auto` option requires all fields to share the same default value.)
Certainly we could define a `foo/keyword` function like this, which calls the plain `foo` struct constructor. I’ve done this many times. Admittedly, if you change the `foo` struct, you have to change this function, too. But usually they’re adjacent in the source code, and anyway it’s only the one place to make the mistake.
Even so, it would be neat if Racket had an option to create such keyword argument constructors for `struct`s automatically.
A macro
Well, this is Racket. Any sentence that starts with, “It would be neat if Racket could ___”, can be answered with, “And I can add that to Racket myself!”
Here’s what we’d be writing by hand:
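As a sketch of that hand-written version, assuming the same three-field `foo` (with a default of 42 for `c`) that the macro's example usage defines:

```racket
#lang racket/base

(struct foo (a b c) #:transparent)

;; One keyword argument per field; #:c is optional and defaults to 42.
(define (foo/kw #:a a #:b b #:c [c 42])
  (foo a b c))

(foo/kw #:a 1 #:b 2 #:c 3) ; => (foo 1 2 3)
(foo/kw #:a 1 #:b 2)       ; => (foo 1 2 42)
```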
We’re defining a function whose name is the struct name with `"/kw"` appended. For each struct field, we want a keyword argument, where the keyword has the same name as the field. Also, we’d like to support optional arguments.
So here’s a macro:
```racket
#lang racket/base

(require (for-syntax racket/base
                     racket/list
                     racket/syntax
                     syntax/parse))

(begin-for-syntax
  (define syntax->keyword (compose1 string->keyword symbol->string syntax->datum)))

(define-syntax (struct/kw stx)
  (define-syntax-class field
    (pattern id:id
             #:with ctor-arg #`(#,(syntax->keyword #'id) id))
    (pattern [id:id default:expr]
             #:with ctor-arg #`(#,(syntax->keyword #'id) [id default])))
  (syntax-parse stx
    [(_ struct-id:id (field:field ...) opt ...)
     (with-syntax ([ctor-id (format-id #'struct-id "~a/kw" #'struct-id)]
                   [((ctor-arg ...) ...) #'(field.ctor-arg ...)]) ;i.e. append*
       #'(begin
           (struct struct-id (field.id ...) opt ...)
           (define (ctor-id ctor-arg ... ...) ;i.e. append*
             (struct-id field.id ...))))]))

;;; Example usage:

;; Define a struct type
(struct/kw foo (a b [c 42]) #:transparent)

;; Use normal ctor
(foo 1 2 3) ; => (foo 1 2 3)

;; Use keyword ctor
(foo/kw #:a 1 #:b 2 #:c 3) ; => (foo 1 2 3)

;; Use keyword ctor, taking advantage of default arg for #:c field
(foo/kw #:a 1 #:b 2) ; => (foo 1 2 42)
```
Lines 2–6 `require` some modules that aren’t part of the `racket/base` environment that macros run in.
Lines 8–9 define a helper function that can be used by a macro. To do that, the function must be defined in a `begin-for-syntax` form.
Line 11 onward is the macro definition.
Lines 12–16 define a syntax class to use with `syntax-parse`. The class matches struct fields, which can be either an identifier alone or an `[identifier default-value]` form. In both cases, the syntax class defines an extra bit of syntax, `ctor-arg`. For each field, this is the arg spec to use in the definition of our special constructor function. This will be something like `#:id id` in the first case or `#:id [id default]` in the second case.
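To see what the syntax class produces, here is a small sketch that runs the same `field` class at run time rather than inside a macro. The `field->ctor-arg` helper is made up for this example, and `syntax->keyword` is written as an ordinary function since nothing here needs `begin-for-syntax`:

```racket
#lang racket/base
(require syntax/parse)

(define (syntax->keyword stx)
  (string->keyword (symbol->string (syntax->datum stx))))

;; Same two alternatives as in the macro: a bare identifier,
;; or an [identifier default] pair.
(define-syntax-class field
  (pattern id:id
           #:with ctor-arg #`(#,(syntax->keyword #'id) id))
  (pattern [id:id default:expr]
           #:with ctor-arg #`(#,(syntax->keyword #'id) [id default])))

;; Hypothetical helper: parse one field and return its ctor-arg as data.
(define (field->ctor-arg stx)
  (syntax-parse stx
    [f:field (syntax->datum #'f.ctor-arg)]))

(field->ctor-arg #'a)      ; => '(#:a a)
(field->ctor-arg #'[c 42]) ; => '(#:c (c 42))
```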
Lines 17–24 are the `syntax-parse` form. The pattern is `(_ struct-id:id (field:field ...) opt ...)`.
This means there will be an identifier for the struct, followed by a list of zero or more fields, and finally zero or more options.
Lines 19–20 use `with-syntax` to create a couple of pattern variables:

```racket
(with-syntax ([ctor-id (format-id #'struct-id "~a/kw" #'struct-id)]
              [((ctor-arg ...) ...) #'(field.ctor-arg ...)]) ;i.e. append*
```
The first, `ctor-id`, is simply the name of our constructor function: append `/kw` to the user’s struct identifier.
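A quick way to convince yourself of what `format-id` builds is to call it outside a macro. In this sketch `#'foo` stands in for whatever struct identifier the user supplies:

```racket
#lang racket/base
(require racket/syntax)

;; The first argument supplies the lexical context for the new identifier;
;; the rest fill in the format string.
(define ctor-id (format-id #'foo "~a/kw" #'foo))

(identifier? ctor-id) ; => #t
(syntax-e ctor-id)    ; => 'foo/kw
```

Using the struct identifier as the lexical context is what lets code that writes `foo/kw` next to the `struct/kw` form refer to the generated function.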
The second, `ctor-arg`, is our list of arg specs for the constructor function. We’ll need to `append*` these, that is, “flatten” them one level, from a list of lists into a list. That’s the reason for the funny nested ellipses: `((ctor-arg ...) ...)`. It sets us up to say `ctor-arg ... ...` down on line 23.
Finally lines 21–24 are the template, the syntax we’re returning. This is simply a `struct` definition plus the definition of our special constructor function. Again, the business with the double ellipses is how we `append*` a list of lists like this:

```racket
'((#:a a) (#:b b) (#:c [c 42]))
```

down to:

```racket
'(#:a a #:b b #:c [c 42])
```
Which is the argument list we want for our constructor.
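To make the flattening concrete, here is a sketch that does it twice: once on plain lists with `append*` from `racket/list`, and once with the nested-ellipsis pattern and a `... ...` template. The names `arg-specs`, `flattened`, and `flattened-by-template` are invented for the example:

```racket
#lang racket/base
(require racket/list)

(define arg-specs '((#:a a) (#:b b) (#:c [c 42])))

;; Plain-list version: splice one level of nesting away.
(define flattened (append* arg-specs))

;; Syntax version: bind a depth-2 pattern variable, then use
;; two ellipses in the template to splice both levels at once.
(define flattened-by-template
  (syntax->datum
   (with-syntax ([((arg ...) ...) #'((#:a a) (#:b b) (#:c [c 42]))])
     #'(arg ... ...))))

flattened             ; => '(#:a a #:b b #:c (c 42))
flattened-by-template ; => '(#:a a #:b b #:c (c 42))
```

The reader treats square brackets and parentheses the same way, which is why `[c 42]` prints back as `(c 42)`.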
And that’s it. Although this macro doesn’t exhaustively cover all possible `struct` options, it’s an example of something you could use in a project to write code that is less repetitive and more resilient.