From: Brendan Hansen
Date: Sun, 20 Sep 2020 03:42:03 +0000 (-0500)
Subject: started working on a formal spec of the language
X-Git-Url: https://git.brendanfh.com/?a=commitdiff_plain;h=77de57526fb6e9f794f94d7c4a15bc310aaa168f;p=onyx.git

started working on a formal spec of the language
---

diff --git a/docs/api_design b/docs/api_design
index d6c85c8e..4ce6ebfe 100644
--- a/docs/api_design
+++ b/docs/api_design
@@ -16,6 +16,9 @@ The standard high level topics to cover are:
     - utility functions
         * random
         * clock
+    - arrays
+    - maps
+    - common conversions
 
 
 
diff --git a/docs/optimizations b/docs/optimizations
deleted file mode 100644
index abbe5aaf..00000000
--- a/docs/optimizations
+++ /dev/null
@@ -1,11 +0,0 @@
-Some optimizations to make to the output WASM:
-    [ ] local.set followed by local.get turns into local.tee
-    [ ] Dead code elimination
-        - Function level: If a function is not explicitly called or exported, it may be removed
-        - Local level: If a local is not read, it may be removed
-    [ ] Add compile-time evaluation for simple operations
-    [ ] If statement compile-time condition evaluation
-        - If the condition of an if statement is compile time known, only generate the case that
-          is used
-    [ ] Inline-ing functions that have been explicitly marked
-
diff --git a/docs/polymorphic_plan b/docs/polymorphic_plan
deleted file mode 100644
index 47fdaf29..00000000
--- a/docs/polymorphic_plan
+++ /dev/null
@@ -1,41 +0,0 @@
-Current basic plan for polymorphic procedures
-
-0. Cleanup some of the aspects of the type system and make slices their own special type.
-
-1. Detect polymorphic parameters
-    Polymorphic parameters will have a $ in from a symbol in their type
-
-2. Polymorphic procedures (polyproc) will have a seperate entity type
-    Most of the stages will ignore them however
-
-3. Polyprocs will store a tabel on them mapping from specific, filled in types to the corresponding function.
-    "T=u32;R=[] u8" ->
-
-4. When a polyproc is called, the polymoprhic parameters are parallel-recursived to find the type matching the polyparam.
-    For example:
-
-    foo :: proc (a: ^[] $T, b: u32) -> T {
-        return a.data[b];
-    }
-
-    arr : [128] []u8;
-    // init arr
-    a := arr[4 : 10];
-    foo(^a, 2);
-
-    When foo is called, we look at the polymorphic parameters, a in this example, are do the following recursion:
-
-    (^[] $T, ^[] []u8)    Both are pointers, so remove ^
-    ([] $T, [] []u8)      Both are slices, so remove []
-    ($T, []u8)            T is resolved to be []u8 in this case
-
-    If at any point, both sides cannot be removed, it is an invalid parameter.
-
-5. When the specific types of the polyproc are resolved, if no matching function already exists, a copy is made.
-    Copies are made from an un-symbol-resolved version of the procedure.
-    Some nodes can be marked as NO_COPY which signals that a copy should not be made.
-
-6. After a copy is made, it is fed through symbol resolution using the correct scope, and then type checking, and then function header and function entities are added to the entity list in the correct position.
-
-
-Ideally, nothing should change with the WASM output.
diff --git a/docs/spec b/docs/spec
new file mode 100644
index 00000000..8d3c2b23
--- /dev/null
+++ b/docs/spec
@@ -0,0 +1,80 @@
+
+                                      Onyx Language Specification
+
+                                             Brendan Hansen
+
+    This is the most formal specification of the Onyx programming language. At the time of writing, most
+    of the design and functionality of Onyx is entirely in my head and I am already beginning to forget some
+    of the features of the language.
+
+    Some common abbrevations used in this paper:
+        LHS for "Left Hand Side"
+        RHS for "Right Hand Side"
+
+    Language Features
+        Grammar.................................................................Line xxx
+        Operator Precedence.....................................................[op-pred]
+        Operators...............................................................Line xxx
+            Arithmetic Operators................................................Line xxx
+            Logical Operators...................................................[log-ops]
+            Bitwise Operators...................................................Line xxx
+            Miscellaneous Operators.............................................Line xxx
+        Packages................................................................Line xxx
+
+
+
+    Operator Precendence
+    [op-pred]
+
+    The following are all the binary operators in Onyx in their precedence from highest to lowest:
+
+    +---+-------------------------------------------+
+    | 9 | %                                         |
+    | 8 | * /                                       |
+    | 7 | + -                                       |
+    | 6 | & | ^ << >> >>>                           |
+    | 5 | <= < >= >                                 |
+    | 4 | == !=                                     |
+    | 3 | && || ^^                                  |
+    | 2 | |> ..                                     |
+    | 1 | = += -= *= /= %= &= |= ^= <<= >>= >>>     |
+    +---+-------------------------------------------+
+
+
+    The following are all the unary operators. Prefix unary operators are right associative and all
+    of equal precedence. Postfix unary operators are left associative and also all of equal precedence.
+
+    Prefix unary operators:
+        !           Boolean NOT
+        -           Arithmetic negative
+        ~           Bitwise NOT
+        ^           Address of
+        *           Dereference
+        cast(X)     Cast to X
+        ~~          Auto cast
+
+    Postfix unary operators:
+        [N]         Array subscript
+        .foo        Field access
+        (..)        Function call
+
+
+
+
+
+    Logical Operators
+    [log-ops]
+
+    There is only one unary logical operator, !. It negates the boolean value given to it.
+
+    Binary logical operators expect both the LHS and RHS to be of boolean type. The following are the logical
+    operators in Onyx:
+
+        &&          Logical AND
+        ||          Logical OR
+        ^^          Logical XOR
+
+    All logical operators are of equal precedence. && and || are standard operators in many C-style languages.
+    ^^ is not found in any of the languages I know about, but I don't see a reason not to have it. There are a few
+    places in the compiler that could utilize the ^^ operator.
+
diff --git a/docs/thoughts b/docs/thoughts
deleted file mode 100644
index 13703fa4..00000000
--- a/docs/thoughts
+++ /dev/null
@@ -1,105 +0,0 @@
-Memory design:
-    - Pointers will work very similar to how they do in C
-        - A pointer is a u32
-    - Pointers will be notated:
-        ^u32 <- Pointer to u32
-
-    - Pointer operations will be:
-        * will take the address of a value
-            - This operation will not be defined well for a while
-            - You can't take the address of a local since it doesn't exist in memory
-        << will take the value out of a pointer
-    - Example use:
-    {{{
-        ptr: ^i32 = 0; // Address starting at 0
-        ptr_ptr := *ptr;
-    }}}
-
-
-
-
-
-Treating top level declarations differently:
-    Currently, top level declarations are treated special, as they would correspond to
-    the structure of the WASM that would be generated. For example,
-
-        inc :: proc (a: i32) -> i32 { return a + 1; }
-
-    would be turned into an AstFunction node with a token of 'inc', and,
-
-        global :: 5
-
-    would be turned into a AstGlobal node with a token of 'global'.
-
-    The problem I have with this approach is it creates an inconsistency when thinking
-    about what is going on in the various stages in the compiler.
-
-    A better approach would be to have a AstBinding node, that represents a binding
-    from a symbol, stored on the token member, to another Ast Node. The node definition
-    would be:
-
-        struct AstBinding { AstTyped base; AstNode* node; }
-
-    For a function definition such as 'inc' above, the node structure would look like:
-
-        AstBinding (inc)
-            .node -> AstFunction
-                        .params -> AstLocal (a) -> NULL
-                        .body -> ...
-
-    This way, in symbol resolution, the top level bindings are added to the table and
-    there are no special cases.
-
-    Other nuances:
-
-        global :: 5
-            This would replace all instances of 'global' with the integer constant
-            5. This would not make a global in WASM.
-
-        global :: i32
-            This would work as a type alias. 'global' would have the type node as it's 'node'
-
-        print :: proc #foriegn "host" "print" (...) ---
-
-
-Explicit overriden functions:
-    Considered syntax:
-
-        foo_i32 :: proc (val: i32) -> i32 ---
-        foo_i64 :: proc (val: i64) -> i64 ---
-        foo_f32 :: proc (val: f32) -> f32 ---
-        foo_f64 :: proc (val: f64) -> f64 ---
-
-        foo :: proc #overload {
-            foo_i32, foo_i64, foo_f32, foo_f64
-        }
-
-        foo(10); // calls foo_i32
-        foo(2.0f); // calls foo_f32
-
-
-        min_f32 :: proc #intrinsic (a: f32, b:f32) -> f32 ---
-        min_f64 :: proc #intrinsic (a: f64, b:f64) -> f64 ---
-
-        min_i32 :: proc (a: i32, b: i32) -> i32 {
-            least := a;
-            if b < a { least = b; }
-
-            return least;
-        }
-
-        min_i64 :: proc (a: i64, b: i64) -> i64 {
-            least := a;
-            if b < a { least = b; }
-
-            return least;
-        }
-
-        min :: proc #overload { min_i32, min_i64, min_f32, min_f64 }
-
-        min(2, 5);
-        min(4.5, 10.4);
-
-Some WASM intruction optimizations:
-    - local.set followed by local.get -> local.tee
-    - Any code after unconditional break before 'end' can be removed
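
The first rule in the removed docs/thoughts notes (local.set followed by local.get becomes local.tee) can be sketched roughly as below. This is an illustration only, not code from the Onyx compiler: the tuple encoding of instructions and the name fuse_local_tee are assumptions made for the example. The pair is fused only when both instructions name the same local.

```python
def fuse_local_tee(instrs):
    """Rewrite adjacent (local.set x, local.get x) pairs into (local.tee x).

    `instrs` is a hypothetical flat encoding of one WASM block body: a list
    of tuples whose first element is the opcode name and whose remaining
    elements are immediates, e.g. ("local.set", 0) or ("i32.add",).
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (nxt is not None
                and cur[0] == "local.set"
                and nxt[0] == "local.get"
                and cur[1] == nxt[1]):
            # local.tee stores the value and leaves it on the stack,
            # which is what the set/get pair does in two steps.
            out.append(("local.tee", cur[1]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```

A set/get pair that refers to two different locals is left untouched, since local.tee can only name one local.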