While it’s great that our language is Turing-complete and all, it’s still
pretty useless if it can’t interact with the outside world. So we’re going to
add four primitive functions to our language: getchar
, print
, itoc
, and
cat
. The first is probably pretty familiar from C, the second is probably
familiar from Python, the third one is a short name for “int to char”, and the
last one concatenates two strings. itoc
allows the programmer to work with
arbitrary ASCII values and make characters such as newline (10), space (32),
etc, and cat
allows the programmer to create symbols character-by-character.
Why have print
instead of putchar
? Well, we don’t have a way to manipulate
symbols on a character level right now, so we might as well be able to print
whole objects.
We’re also going to add a built-in value to the basis, empty-symbol
. Right
now there is no way that I know of to build an empty symbol. That is, there is
no way of reading something whose value is (Symbol "")
. In order to write a
some functions easily, that will be a necessity.
let basis =
[...]
let prim_getchar = function
| [] ->
(try Fixnum (int_of_char @@ input_char stdin)
with End_of_file -> Fixnum (-1))
| _ -> raise (TypeError "(getchar)")
in
let prim_print = function
| [v] -> let () = print_string @@ string_val v in Symbol "ok"
| _ -> raise (TypeError "(print val)")
in
let prim_itoc = function
| [Fixnum i] -> Symbol (stringOfChar @@ char_of_int i)
| _ -> raise (TypeError "(itoc int)")
in
let prim_cat = function
| [Symbol a; Symbol b] -> Symbol (a ^ b)
| _ -> raise (TypeError "(cat sym sym)")
in
let newprim acc (name, func) =
bind (name, Primitive(name, func), acc)
in
List.fold_left newprim ["empty-symbol", ref (Some (Symbol ""))] [
[...]
("getchar", prim_getchar);
("print", prim_print);
("itoc", prim_itoc);
("cat", prim_cat);
]
getchar
expects no arguments and returns a character from standard input.
Since we have no way of detecting exceptions raised, we’ll return -1
if we
encounter EOF
. This is useful because -1 is not in the ASCII table and
therefore corresponds to no valid character. Notice that getchar
returns a
Fixnum
and therefore has to be converted with itoc
by the programmer.
print
prints the string_val
representation of a value and returns the
symbol ok
. Nothing spcial.
itoc
converts the given number into a character, makes a string of length 1,
and then returns a symbol. Nothing fancy.
You’ll notice that getchar
and itoc
use this weird operator @@
. Turns out
it’s actually a pretty handy operator. Because of its precedence, it allows for
chaining of functions, and they are applied in sequence. So it makes the
following two lines equivalent:
int_of_char @@ input_char stdin
int_of_char (input_char stdin)
Yeah, it’s one character longer, but it removes parentheses.
empty-symbol
is defined as expected.
There’s one really annoying problem with these functions, though. If you stopped reading and went ahead and implemented them, you’d probably see behavior like:
$ ocaml 15_io.ml
> (getchar)
10
> (itoc (getchar))
>
$
What gives? It’s not even waiting for me to type something — it just takes the newline!
That’s for two reasons:
So when I go to hit enter, the reader goes all the way until the last parenthesis, then stops. And then the next character is the newline, 10.
The easy fix, as it turns out, is to add the ability to read from a file instead of only supporting a REPL. We’ll make a better reader in the next post, so hold off on line reading for a second.
Let’s go back to main
:
let get_ic () =
try open_in Sys.argv.(1)
with Invalid_argument s -> stdin
let main =
let ic = get_ic () in
let stm = { chr=[]; line_num=1; chan=ic } in
try repl stm basis
with End_of_file -> if ic <> stdin then close_in ic
This is some pretty new stuff. get_ic
means “get input channel”. It tries to
get the command-line argument to the interpeter and open it as an input
channel. open_in
is analogous to fopen
from C.
Note that it gets index 1 instead of index 0, since 15_io.ml
is in index 0:
index: 0 1
$ ocaml 15_io.ml mylispprogram.lsp
Once it has the channel, it makes our nice little stream and calls repl
. If
that has an End_of_file
exception, which happens if we hit Control-D in the
REPL or the program reaches the end of our input file, it cleans up after
itself and exits.
Note that we only want to close the input file if it’s not stdin
(the <>
operator), otherwise we can’t read anything else after.
We should probably modify repl
so that it doesn’t print a prompt if it’s
reading from a file that the user passed in:
let rec repl stm env =
if stm.chan=stdin then ( print_string "> "; flush stdout; );
let ast = build_ast (read_sexp stm) in
let (result, env') = eval ast env in
if stm.chan=stdin then print_endline (string_val result);
repl stm env';;
We also probably don’t want to print the result of every intermediate expression in the user’s program, so we limit that to REPL-only as well.
Let’s take it for a spin:
$ cat > myprogram.lsp
(print (itoc (getchar)))
(print (itoc 10))
(print (cat 'hello (cat (itoc 32) 'world)))
$ ocaml 15_io.ml myprogram.lsp
a
a
hello world
$
Nice.
Download the code here if you want to mess with it.
There are some I/O functions that would be really useful to have in user-define programs, but that we don’t want to bake right into the OCaml code. This is where having a standard library would come in super handy, so that’s up next!