Commit 4659795

author
rich
committed
Assembly code.
Performance code (start of). More documentation fixes / clarifications.
1 parent 83c6612 commit 4659795

File tree

6 files changed

+156
-50
lines changed

.cvsignore

+2-1
@@ -1 +1,2 @@
-jonesforth
+jonesforth
+perf_dupdrop

Makefile

+8-1
@@ -1,4 +1,4 @@
-# $Id: Makefile,v 1.6 2007-10-07 11:07:15 rich Exp $
+# $Id: Makefile,v 1.7 2007-10-10 13:01:05 rich Exp $

 SHELL := /bin/bash

@@ -13,6 +13,8 @@ run:
 clean:
 	rm -f jonesforth *~ core .test_*

+# Tests.
+
 TESTS := $(patsubst %.f,%.test,$(wildcard test_*.f))

 test check: $(TESTS)
@@ -27,6 +29,11 @@ test_%.test: test_%.f jonesforth
 	@rm -f .$@
 	@echo "ok"

+# Performance.
+
+perf_dupdrop: perf_dupdrop.c
+	gcc -O3 -Wall -Werror -o $@ $<
+
 .SUFFIXES: .f .test
 .PHONY: test check

jonesforth.S

+19-17
@@ -1,11 +1,11 @@
 /* A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
 By Richard W.M. Jones <[email protected]> http://annexia.org/forth
 This is PUBLIC DOMAIN (see public domain release statement below).
-$Id: jonesforth.S,v 1.42 2007-10-07 11:07:15 rich Exp $
+$Id: jonesforth.S,v 1.43 2007-10-10 13:01:05 rich Exp $

 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -Wl,--build-id=none -o jonesforth jonesforth.S
 */
-.set JONES_VERSION,42
+.set JONES_VERSION,43
 /*
 INTRODUCTION ----------------------------------------------------------------------

@@ -102,9 +102,9 @@
 Secondly make sure TABS are set to 8 characters. The following should be a vertical
 line. If not, sort out your tabs.

-|
-|
-|
+|
+|
+|

 Thirdly I assume that your screen is at least 50 characters high.

@@ -151,7 +151,8 @@
 mov 2,%eax reads the 32 bit word from address 2 into %eax (ie. most likely a mistake)

 (4) gas has a funky syntax for local labels, where '1f' (etc.) means label '1:' "forwards"
-and '1b' (etc.) means label '1:' "backwards".
+and '1b' (etc.) means label '1:' "backwards". Notice that these labels might be mistaken
+for hex numbers (eg. you might confuse 1b with $0x1b).

 (5) 'ja' is "jump if above", 'jb' for "jump if below", 'je' "jump if equal" etc.

@@ -269,8 +270,8 @@
 caches than those early computers had in total, but the execution model still has some
 useful properties].

-Of course this code won't run directly any more. Instead we need to write an interpreter
-which takes each pair of bytes and calls it.
+Of course this code won't run directly on the CPU any more. Instead we need to write an
+interpreter which takes each set of bytes and calls it.

 On an i386 machine it turns out that we can write this interpreter rather easily, in just
 two assembly instructions which turn into just 3 bytes of machine code. Let's store the
@@ -455,10 +456,10 @@
 Because we will need to restore the old %esi at the end of DOUBLE (this is, after all, like
 a function call), we will need a stack to store these "return addresses" (old values of %esi).

-As you will have read, when reading the background documentation, FORTH has two stacks,
-an ordinary stack for parameters, and a return stack which is a bit more mysterious. But
-our return stack is just the stack I talked about in the previous paragraph, used to save
-%esi when calling from a FORTH word into another FORTH word.
+As you will have seen in the background documentation, FORTH has two stacks, an ordinary
+stack for parameters, and a return stack which is a bit more mysterious. But our return
+stack is just the stack I talked about in the previous paragraph, used to save %esi when
+calling from a FORTH word into another FORTH word.

 In this FORTH, we are using the normal stack pointer (%esp) for the parameter stack.
 We will use the i386's "other" stack pointer (%ebp, usually called the "frame pointer")
@@ -598,6 +599,7 @@ cold_start: // High-level code without a codeword.
 unsure of them).

 The long way would be:
+
 .int <link to previous word>
 .byte 6 // len
 .ascii "DOUBLE" // string
@@ -661,6 +663,7 @@ name_\label :
 LINK in next word

 Again, for brevity in writing the header I'm going to write an assembler macro called defcode.
+As with defword above, don't worry about the complicated details of the macro.
 */

 .macro defcode name, namelen, flags=0, label
@@ -783,7 +786,7 @@ code_\label : // assembler code follows
 NEXT

 /*
-Lots of comparison operations.
+Lots of comparison operations like =, <, >, etc..

 ANS FORTH says that the comparison words should return all (binary) 1's for
 TRUE and all 0's for FALSE. However this is a bit of a strange convention
@@ -1221,7 +1224,7 @@ var_\name :
 and compiling code, we might be reading words to execute, we might be asking for the user
 to type their name -- ultimately it all comes in through KEY.

-The implementation of KEY uses an input buffer of a certain size (defined at the start of this
+The implementation of KEY uses an input buffer of a certain size (defined at the end of this
 file). It calls the Linux read(2) system call to fill this buffer and tracks its position
 in the buffer using a couple of variables, and if it runs out of input buffer then it refills
 it automatically. The other thing that KEY does is if it detects that stdin has closed, it
@@ -1238,7 +1241,6 @@ var_\name :
 currkey (next character to read)

 <---------------------- BUFFER_SIZE (4096 bytes) ---------------------->
-
 */

 defcode "KEY",3,,KEY
@@ -1250,9 +1252,9 @@ _KEY:
 cmp (bufftop),%ebx
 jge 1f // exhausted the input buffer?
 xor %eax,%eax
-mov (%ebx),%al
+mov (%ebx),%al // get next key from input buffer
 inc %ebx
-mov %ebx,(currkey)
+mov %ebx,(currkey) // increment currkey
 ret

 1: // Out of input; use read(2) to fetch more input from stdin.

jonesforth.f

+89-31
@@ -2,7 +2,7 @@
 \ A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
 \ By Richard W.M. Jones <[email protected]> http://annexia.org/forth
 \ This is PUBLIC DOMAIN (see public domain release statement below).
-\ $Id: jonesforth.f,v 1.13 2007-10-07 11:07:15 rich Exp $
+\ $Id: jonesforth.f,v 1.14 2007-10-10 13:01:05 rich Exp $
 \
 \ The first part of this tutorial is in jonesforth.S. Get if from http://annexia.org/forth
 \
@@ -24,9 +24,9 @@
 \ Secondly make sure TABS are set to 8 characters. The following should be a vertical
 \ line. If not, sort out your tabs.
 \
-\ |
-\ |
-\ |
+\ |
+\ |
+\ |
 \
 \ Thirdly I assume that your screen is at least 50 characters high.
 \
@@ -65,10 +65,6 @@
 : 2DUP OVER OVER ;
 : 2DROP DROP DROP ;

-\ More standard FORTH words.
-: 2* 2 * ;
-: 2/ 2 / ;
-
 \ NEGATE leaves the negative of a number on the stack.
 : NEGATE 0 SWAP - ;

@@ -658,8 +654,9 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 want a variable which is read often, and written infrequently.

 20 VALUE VAL creates VAL with initial value 20
-VAL pushes the value directly on the stack
+VAL pushes the value (20) directly on the stack
 30 TO VAL updates VAL, setting it to 30
+VAL pushes the value (30) directly on the stack

 Notice that 'VAL' on its own doesn't return the address of the value, but the value itself,
 making values simpler and more obvious to use than variables (no indirection through '@').
@@ -833,10 +830,10 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 )
 : DUMP ( addr len -- )
 BASE @ ROT ( save the current BASE at the bottom of the stack )
-HEX ( and switch the hexadecimal mode )
+HEX ( and switch to hexadecimal mode )

 BEGIN
-DUP 0> ( while len > 0 )
+?DUP ( while len > 0 )
 WHILE
 OVER 8 U.R ( print the address )
 SPACE
@@ -845,19 +842,19 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 2DUP ( addr len addr len )
 1- 15 AND 1+ ( addr len addr linelen )
 BEGIN
-DUP 0> ( while linelen > 0 )
+?DUP ( while linelen > 0 )
 WHILE
 SWAP ( addr len linelen addr )
 DUP C@ ( addr len linelen addr byte )
 2 .R SPACE ( print the byte )
 1+ SWAP 1- ( addr len linelen addr -- addr len addr+1 linelen-1 )
 REPEAT
-2DROP ( addr len )
+DROP ( addr len )

 ( print the ASCII equivalents )
 2DUP 1- 15 AND 1+ ( addr len addr linelen )
 BEGIN
-DUP 0> ( while linelen > 0)
+?DUP ( while linelen > 0)
 WHILE
 SWAP ( addr len linelen addr )
 DUP C@ ( addr len linelen addr byte )
@@ -868,7 +865,7 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 THEN
 1+ SWAP 1- ( addr len linelen addr -- addr len addr+1 linelen-1 )
 REPEAT
-2DROP ( addr len )
+DROP ( addr len )
 CR

 DUP 1- 15 AND 1+ ( addr len linelen )
@@ -880,7 +877,7 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 SWAP ( addr-linelen len-linelen )
 REPEAT

-2DROP ( restore stack )
+DROP ( restore stack )
 BASE ! ( restore saved BASE )
 ;

@@ -891,13 +888,13 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 agreed syntax for this, so I've gone for the syntax mandated by the ISO standard
 FORTH (ANS-FORTH).

-( some value on the stack )
-CASE
-test1 OF ... ENDOF
-test2 OF ... ENDOF
-testn OF ... ENDOF
-... ( default case )
-ENDCASE
+( some value on the stack )
+CASE
+test1 OF ... ENDOF
+test2 OF ... ENDOF
+testn OF ... ENDOF
+... ( default case )
+ENDCASE

 The CASE statement tests the value on the stack by comparing it for equality with
 test1, test2, ..., testn and executes the matching piece of code within OF ... ENDOF.
@@ -912,14 +909,14 @@ \ FORTH allows ( ... ) as comments within function definitions. This works by h
 An example (assuming that 'q', etc. are words which push the ASCII value of the letter
 on the stack):

-0 VALUE QUIT
-0 VALUE SLEEP
-KEY CASE
-'q' OF 1 TO QUIT ENDOF
-'s' OF 1 TO SLEEP ENDOF
-( default case: )
-." Sorry, I didn't understand key <" DUP EMIT ." >, try again." CR
-ENDCASE
+0 VALUE QUIT
+0 VALUE SLEEP
+KEY CASE
+'q' OF 1 TO QUIT ENDOF
+'s' OF 1 TO SLEEP ENDOF
+( default case: )
+." Sorry, I didn't understand key <" DUP EMIT ." >, try again." CR
+ENDCASE

 (In some versions of FORTH, more advanced tests are supported, such as ranges, etc.
 Other versions of FORTH need you to write OTHERWISE to indicate the default case.
@@ -1630,6 +1627,67 @@ EXCEPTION-MARKER, namely a function that just drops the stack frame and itself
 . CR
 ;

+(
+ASSEMBLER CODE ----------------------------------------------------------------------
+
+This is just the outline of a simple assembler, allowing you to write FORTH primitives
+in assembly language.
+
+Assembly primitives begin ': NAME' in the normal way, but are ended with ;CODE. ;CODE
+updates the header so that the codeword isn't DOCOL, but points instead to the assembled
+code (in the DFA part of the word).
+
+We provide a convenience macro NEXT (you guessed the rest).
+
+The rest consists of some immediate words which expand into machine code appended to the
+definition of the word. Only a very tiny part of the i386 assembly space is covered, just
+enough to write a few assembler primitives below.
+)
+
+: ;CODE IMMEDIATE
+ALIGN ( machine code is assembled in bytes so isn't necessarily aligned at the end )
+LATEST @ DUP
+HIDDEN ( unhide the word )
+DUP >DFA SWAP >CFA ! ( change the codeword to point to the data area )
+[COMPILE] [ ( go back to immediate mode )
+;
+
+HEX
+
+( Equivalent to the NEXT macro )
+: NEXT IMMEDIATE AD C, FF C, 20 C, ;
+
+( The i386 registers )
+: EAX IMMEDIATE 0 ;
+: ECX IMMEDIATE 1 ;
+: EDX IMMEDIATE 2 ;
+: EBX IMMEDIATE 3 ;
+: ESP IMMEDIATE 4 ;
+: EBP IMMEDIATE 5 ;
+: ESI IMMEDIATE 6 ;
+: EDI IMMEDIATE 7 ;
+
+( i386 stack instructions )
+: PUSH IMMEDIATE 50 + C, ;
+: POP IMMEDIATE 58 + C, ;
+
+( RDTSC instruction )
+: RDTSC IMMEDIATE 0F C, 31 C, ;
+
+DECIMAL
+
+(
+RDTSC is an assembler primitive which reads the Pentium timestamp counter (a very fine-
+grained counter which counts processor clock cycles). Because the TSC is 64 bits wide
+we have to push it onto the stack in two slots.
+)
+: RDTSC ( -- lsb msb )
+RDTSC ( writes the result in %edx:%eax )
+EAX PUSH ( push lsb )
+EDX PUSH ( push msb )
+NEXT
+;CODE
+
 (
 NOTES ----------------------------------------------------------------------

perf_dupdrop.c

+33
@@ -0,0 +1,33 @@
+/* Ideal DUP DROP * 1000 assuming perfect inlining.
+   $Id: perf_dupdrop.c,v 1.1 2007-10-10 13:01:05 rich Exp $
+*/
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#define DUP \
+  asm volatile ("mov (%%esp),%%eax\n" \
+                "\tpush %%eax" \
+                : : : "eax")
+#define DROP \
+  asm volatile ("pop %%eax" \
+                : : : "eax")
+
+#define DUPDROP DUP; DROP;
+#define DUPDROP10 DUPDROP DUPDROP DUPDROP DUPDROP DUPDROP DUPDROP DUPDROP DUPDROP DUPDROP DUPDROP
+#define DUPDROP100 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10 DUPDROP10
+#define DUPDROP1000 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100 DUPDROP100
+
+int
+main (int argc, char *argv[])
+{
+  unsigned long long start_time, end_time;
+
+  asm volatile ("rdtsc" : "=A" (start_time));
+  DUPDROP1000
+  asm volatile ("rdtsc" : "=A" (end_time));
+
+  printf ("%llu\n", end_time - start_time);
+
+  exit (0);
+}

perf_dupdrop.f

+5
@@ -0,0 +1,5 @@
+( -*- text -*-
+FORTH repeated DUP DROP * 1000 using ordinary indirect threaded code
+and the assembler primitives.
+$Id: perf_dupdrop.f,v 1.1 2007-10-10 13:01:05 rich Exp $ )
+