EK9 Basics
Please take a look at the introduction if you are looking to understand what EK9 is. Review structure for an overview of the file structure and the constructs.
This section is mainly about layout, declarations, comments and visibility.
Getting Started
- Indentation
- Line Termination
- Parameter Passing
- Operators
- Objects and no Primitives
- No auto boxing (and why)
- Declarations
- Variable Declarations
- Different assignment symbols (and why)
- Type Inference
- Checking if a variable has a value!
- Assignment
- One Line Comments
- Block Comments
- Documentation Comments
- Pure and re-assignment
- Visibility
- Module Visibility
- Function Visibility
- Record Visibility
- Class Visibility
- Trait Visibility
- Component Visibility
- Parameter access
Symbols and Punctuation
As mentioned in the introduction, EK9 is indentation based rather than using '{' and '}' to delimit blocks. The '{' and '}' symbols are used for Dictionary (Map) initialisation.
Line Termination
The ; (semi-colon) is not used anywhere at all; not even as a line terminator.
The '<' and '>' in combination are only used for comments and not for any sort of generics/template declarations.
Importantly there are no in-line lambdas as such, so symbol combinations of → and => are not used in that context.
Accepting Parameters
The → symbol is used to denote incoming parameters and the ← is used to mean 'coming
out of'/'returning a value' or defining, inferring and initialising a constant/variable.
See declarations and type inference.
When calling methods or functions parameters are passed using '(' and ')' (parentheses) as with
most programming languages. It is only where parameters are accepted or returned in methods or
functions that → and ← are employed.
Parameter passing examples.
#!ek9
defines module introduction

  defines function

    function1()
      -> n Integer
      <- sum Integer: 0
      ...

    function2()
      ->
        n1 Integer
        n2 Integer
      <-
        sum Integer: 0
      ...

//EOF
In general, if passing in one parameter the single line input (→ n Integer) is preferred (see function1 above). If there are two or more parameters (see function2 above) then the → should appear on a line by itself and the parameters should appear on consecutive lines but indented.
While it is not mandatory; the returning parameter (if any) should follow the same layout as the incoming
parameter(s), i.e. single line (function1) or indented form (function2).
There can only be zero or one named return parameter; multiple return values are not supported.
However, as all parameters are passed by reference, it is possible to alter the value of incoming parameters by
'copying' data into them. This is a little like 'inout' parameters in some languages.
This mechanism of parameter declaration has been taken to force a less dense text approach. It also encourages fewer parameters to be employed.
Operators
EK9 uses a range of symbols and symbol combinations for operators. This is intended to drive consistency (use long term memory for ideograms) in common areas such as addition of two items and also addition of an item into a collection.
This is more in line with the C++ and Scala languages;
it is a move away from just adding various and inconsistent method names with very different semantics
to Object types. Care has to be taken when overriding and using operators as excessive or misleading/
inappropriate use can lead to confusion and code that is hard to understand/maintain.
EK9 enforces both immutability and mutation semantics on operators.
Objects
There are no primitive types in EK9 - this means that everything is an Object. This is quite
an important point because most languages do have primitive types and so when values are passed to
methods or functions they are passed by value. In EK9 all values are effectively
passed by reference.
But note EK9 adds the modifier pure which means that mutable data structures can be
passed by reference into methods and functions; but when these are marked with the pure
modifier even mutable data cannot be modified.
So what does this mean? A primitive int or float in most languages always has a value even if it is just declared without initialisation (0 and 0.0f respectively). An amount of memory has been allocated to the variable for it to store the value.
So this means that a primitive always has memory allocated and there can never be a null pointer issue then? Well you might think that but with languages that support auto boxing (see here for what this means) errors can still occur where the Object form of a value is auto boxed into a primitive.
Auto-Boxing (not supported)
- //If auto boxing was supported
- //primitiveInt would be initialised as 0
- primitiveInt as int
- //objectInt would not have any memory allocated to hold its value
- objectInt as Integer
- //So what happens here?
- primitiveInt = objectInt
- //Some type of null exception will occur
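For comparison, the same failure can be reproduced in a language that does support auto boxing. The following is a minimal sketch in Java (not EK9), given only as an illustration of the problem described above; the class name AutoBoxingDemo and the method assign are hypothetical names chosen for this example.

```java
// Illustration only: this is Java, not EK9.
class AutoBoxingDemo {

    // Tries to assign an Integer object to a primitive int and reports what happened.
    static String assign(Integer objectInt) {
        try {
            // Implicit unboxing: the compiler inserts objectInt.intValue() here.
            int primitiveInt = objectInt;
            return "assigned " + primitiveInt;
        } catch (NullPointerException e) {
            // Raised when objectInt refers to no object at all.
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(assign(21));   // prints "assigned 21"
        System.out.println(assign(null)); // prints "NullPointerException"
    }
}
```

The assignment compiles without complaint in both cases; the failure only shows up at run time, which is the class of error EK9 avoids by having no primitives.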
Consistency
In general when working in an Object-Oriented or Functional Programming model it's best to be consistent. For example all the EK9 collection types work with any type of Object and even functions can be passed as Objects (delegates). This is the main reason EK9 does not have primitives - therefore no auto boxing is ever used and therefore that sort of null pointer error can never occur.
The downside is that Objects are much more heavyweight than primitive types, which could easily be held in a CPU register. This just means that the code generation phase has to work much harder to produce efficient executable instructions. But from a programmer/developer point of view, there is consistency.
Just think in terms of Objects and not low level primitives. There are no arrays but there are Lists and Dictionaries and to access the contents of these you can use Iterators but also Stream pipelines as outlined in the introduction.
Declarations
The standard approach in EK9 to declare a variable or a constant is to always allocate it some memory so that it always 'points somewhere'. This means that a variable (like an Integer) can be declared; but importantly it can be declared as not having any meaningful value.
This may seem like a trivial, unimportant or even a nonsensical thing to do, why declare something with no meaningful value? See the example below.
Variable declarations
- //Declare an Integer - but with no value yet known.
- age ← Integer()
- //Declare an Integer - but with a known value from the outset.
- minimumAge ← 21
In the above example a variable of type Integer has been declared; the variable is called age.
If we assume the program we are about to write is going to ask the user for their age, then at this point
in time we don't yet know what that value will be.
Normally you would have to use a value like -1 or some other value to indicate that
the value had not yet been set, or if you were to use an Object version, you might have left it as
null.
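As a rough sketch of those two workarounds in a language with primitives, the Java (not EK9) example below shows the extra check each one forces on the caller. It is illustrative only: NOT_SET, SentinelDemo and both method names are hypothetical and do not come from any library.

```java
// Illustration only: this is Java, not EK9. All names here are hypothetical.
class SentinelDemo {

    // Sentinel value standing in for "age not yet known".
    static final int NOT_SET = -1;

    // Primitive form: every caller must remember to compare against the sentinel.
    static boolean oldEnoughPrimitive(int age, int minimumAge) {
        return age != NOT_SET && age >= minimumAge;
    }

    // Object form: every caller must remember the null check instead.
    static boolean oldEnoughObject(Integer age, int minimumAge) {
        return age != null && age >= minimumAge;
    }

    public static void main(String[] args) {
        System.out.println(oldEnoughPrimitive(NOT_SET, 21)); // prints "false"
        System.out.println(oldEnoughObject(null, 21));       // prints "false"
        System.out.println(oldEnoughPrimitive(30, 21));      // prints "true"
    }
}
```

In both forms the 'not yet known' state is a convention the programmer has to remember, rather than something the type itself carries.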
With EK9 there is no need for this. Just declare age as above; it has memory allocated but is noted as having no meaningful value. The variable minimumAge is declared with a value that is known from the outset; importantly, the type of minimumAge has been inferred as an Integer.
It is also possible to declare the same value as below, if you prefer. In this case you are being explicit as to what the type is. This becomes important when you want a variable to be of a specific type (typically in class, component, trait or function hierarchies).
- //Declare an Integer - but with no value yet known.
- age as Integer: Integer()
- //Or like this.
- age as Integer = Integer()
- //Or even like this.
- age as Integer := Integer()
- //Declare an Integer - without type inference
- minimumAge as Integer := 21
It is possible to declare variables as shown below. This sort of syntax should really only be used when
receiving parameters on methods/functions; it is best avoided when declaring local variables or
fields/properties on records, classes and components,
as it is very likely to lead to some sort of error. An exception would be when you are dealing with
an abstract type/trait, as you cannot instantiate an abstract type/trait/class.
EK9 enforces the ? suffix for this type of declaration and also checks that the variable is assigned
before use.
- //Declare an Integer - but no space allocated
- possibleError as Integer?
Why use different ways to declare a variable/parameter you may ask? In short; it is a mix of history, a desire for type inference and polymorphic variables. Always prefer the following forms where possible.
- //Local variable declaration and initialisation.
- variableName ← TypeConstructor()
- //polymorphic variable declaration and initialisation.
- variableName as SuperType: SubTypeConstructor()
- //Preferred field/property declaration and initialisation.
- fieldName as Type: TypeConstructor()
History
There are programming constructs and technologies such as CSS and JSON where the ':' is used extensively for assignment when data elements are initialised. Not only is this easy on the eye, it also means that many programmers have become used to this syntax.
However HTML uses the '=' for assignment to properties in UI elements and most programming languages use '=' for assignment of variables, so again there will be many programmers that are used to using '=' for assignment.
So why add ':=' as well? Both Pascal and Ada use ':=' for assignment (as do other languages), but more importantly it brings the ability to drive for consistency with the other operators listed below. Note that ':=' also looks and feels stronger as an assignment than either ':' or '=' separately. Use the assignment operator you feel most happy with; but try to be consistent.
Other types of assignment and equality checks:
- ?= The guarded assignment
- == Test for equality
- <= Less than or equal
- >= Greater than or equal
- <> Not equal to (like SQL not equals)
- != Also not equal to (like C, C++, JavaScript not equals)
- += Add value and assign to
- -= Subtract value and assign to
- *= Multiply value and assign to
- /= Divide by value and assign to
There are more operators available, but those above all involve = either in the form of assignment or in the form of testing equality. So adding in := does bring some form of two character consistency (in a way). EK9 does provide a range of coalescing operators which work well with the 'is set' nature of EK9 variables.
Type Inference
The use of ← is the main mechanism to not only declare a variable; but also have its type inferred. As you can see in the example below, the inference mechanism is much more terse, easy to write and understand.
- minimumAge ← 21
- //Alternatively - which can be useful
- minimumAge as Integer: 21
So EK9 uses type inference for the consumer of methods, functions and fields but forces the writer of the records, classes, components, methods and functions to fully declare the fields and parameters. This makes it much quicker and easier for the programmer to determine the types they are dealing with. They also get the speed and clarity of type inference when writing bodies of code, which is where the bulk of code is written.
Why not just use type inference everywhere?
It was noted when looking at type inference that, while it is possible with a Hindley-Milner type inference mechanism to deduce and infer all the types, developers actually needed to quickly see what type they were dealing with, either in the definition of structured constructs or in the use of such constructs. While modern IDEs can really help in this regard (by showing types via hovers for example), it became very obvious that it was much quicker and easier for a developer to just follow the method call and immediately see what the type was.
Checking variables
Clearly if it is possible to create/declare a variable and not set it to any meaningful value; there has to be some way to check it. As mentioned before with primitives or Object type in some languages it would be necessary to check for -1 (if that denoted not set) or maybe even check if an object was set to null/nil if the language supported that concept (EK9 does not).
EK9 uses the ? (is set) operator in the following way.
- age ← Integer()
- if age?
- //This block would not be executed
- else
- //This block would be the section that is executed
The good thing about the ? operator is that it can be applied to any type of object. EK9 also provides a set of ternary and assignment coalescing operators specifically for dealing with variables that may or may not be meaningful values.
When you develop classes or records you can/should override the ? operator. This also applies to generics in the following way. While generics have not really been covered yet, they are intrinsic to how EK9 works and the built-in set of generic types really are first class constructs and so are treated on a par with Integers and Strings; for example.
- //A simple empty list
- aList as List of Integer := List()
- if aList?
- //Code where aList has some meaningful value/content
Incidentally if you have an initial set of values you need in a list then you can use the shorthand below. But there is more on this in the collection types section.
- //A list with some values using the 'list shorthand of' [] - this is not an array!
- bList ← [ "A", "B", "C" ]
- if bList?
- //This block would be the section that is executed
This logic also applies to iterators (another supported generic type). The simple loop example below gets each value via an iterator and prints it as a String on the standard output (console). But see the introduction for alternatives to using Iterators.
- stdout ← Stdout()
- bList ← [ "A", "B", "C" ]
- iteratorB ← bList.iterator()
- while iteratorB?
- stdout.println("Value is [" + iteratorB.next() + "]")
- //The stream pipeline construct is cleaner though
- cat bList > stdout
In summary then; it is always best to declare a variable and assign it some space to hold a value, even if that value is not yet known.
Accept, for example, that a Boolean can really have three states: true, false and, importantly, not yet known.
Assignment
The declarations above really combine the declaration of a variable name, its type (sometimes inferred) and its initial value into a single statement. In general, the creation of the variable and its assignment is always best done in one go like this.
Easy syntax like ← makes this approach more likely to be adopted. It is also simpler on the eye to understand (low cognitive load), is less typing and encourages less variable reuse and better variable naming.
However there are times when you will already have a variable and just want to assign it a new value. The example below shows that syntax. But review ternary operators as there is a nice syntax that makes a single assignment much more obvious.
- //Declare charges and initialise
- charges ← 20.99
- //Use charges for some processing
- ...
- //Now set the same variable to a new value
- charges := 50.99
- //Use updated charges for some processing
Note that the assignment was done using ':=', the assignment could have been done with ':' or '='. Depending on your point of view, the reassignment of a variable to a new value could be considered quite significant. Indeed, some developers would say you should never reassign a variable. Let's see what that would look like using the other assignment operators.
- //Declare charges and initialise
- charges ← 20.99
- ...
- charges: 50.99
- charges = 50.99
- //Using := 'feels' more significant
- charges := 50.99
EK9 does not stop you reassigning variables unless the block of code you do it in is marked as pure, then only return values can be reassigned. See the section on pure for more details. Pure is the main reason EK9 does not use keywords like const, final or static. It's also why it does not have immutable versions of Generic collection types. Pure is a very much stronger concept if developers are looking to embrace immutability in their code.
EK9 takes the approach of forcing the whole block (function/method) to be pure.
If the developer feels strongly about controlling reassignment, adopting a more pure approach
is the way to go.
For most developers this may feel a little extreme. But EK9 does mandate pure on some
specific operators to preserve semantics.
Comments
There are four types of comments that are available in EK9 source code; these are shown below. Please note that the C/C++/C#/Java style of comments /* */ is not supported.
One Line Comments
A single line comment is shown below.
- //A one line comment
- variable ← 21 //Comments can be at the end of a line if necessary
The single line comment starts with '//' and continues until the end of the line.
Block Comments
There are two types of general block comments, which follow the HTML/XML coding standards as shown below.
- <!--
- These lines will be contained
- within a comment block.
- -->
The block comment below is an alternative and more consistent mark up.
- <!-
- These lines will be contained
- within a comment block, just like above.
- -!>
Documentation Comments
This comment type is really only used for documentation comments.
- <?-
- Just sums the squares of values from 1 to n.
- -?>
EK9 took a different approach to comments from many other languages to encourage the limited use of comments. While this may sound strange, excessive commenting can be misleading and lead to too much visual clutter.
Pure Keyword
Whilst not a mainstream approach by developers with an Object-Oriented or Procedural background, the concept of not allowing/controlling variable reassignment is really a key element for those with a Functional Programming background. The idea of which is quite attractive in many ways and can lead to code that is much less error-prone.
This blog post has a little more logic and rationale as to why immutability is important but also quite hard to get right.
It is not without cost however, sometimes the solution approach you would have traditionally used cannot be applied. Sometimes you will need to rethink your approach to a problem and come up with an alternative solution.
Below is an example of a function that has been marked as pure; to a 'purist' functional programmer this would not qualify as being pure as there are a couple of explicit and implicit reassignments.
- sum - this is clearly being reassigned - EK9 allows this, as it is a return variable.
- i - the loop variable is obviously being incremented - EK9 allows this as well.
#!ek9
defines module introduction

  defines function

    <?-
      Just sums the squares of values from 1 to n.
    -?>
    sumOfSquares() as pure
      -> n Integer
      <- sum Integer: 0

      for i in 1 ... n
        sum += i*i

    //An alternative solution using stream pipeline

    <?-
      Square of val.
    -?>
    square() as pure
      -> val as Integer
      <- squared as Integer: val * val

    streamingSumOfSquares() as pure
      -> n Integer
      <- sum Integer: for i in 1 ... n | map with square | collect as Integer

//EOF
The stream pipeline offers an alternative approach which is more functional in nature.
The example below shows the advantage of marking something pure: the following code would result in a compiler error, because there is an assignment to an incoming parameter variable.
- ...
- sumOfSquares() as pure
- → n as Integer
- ← sum Integer: 0
- n := 20
- for i in 1 ... n
- ...
It's important to know that pure can only call pure. If you have some existing functionality that is not marked as pure you cannot call it from your pure function! The converse is not true however!
Visibility
In EK9 there are several concepts of visibility that are important to understand; these are outlined below. Each of the different constructs has different capabilities, roles and purpose; as such the level of visibility of fields/properties and methods varies from construct to construct. While this may seem confusing, there is a purpose to these different levels of visibility per construct.
Module Visibility
All constructs that are defined in a module (i.e. defines module introduction) are all visible to each other so function sumOfSquares (above) would be visible to any other function or construct in the same module.
References
However if you need (and you will) to access constructs from other modules there are two ways you can do this. The first way is shown below; and it is to use a references statement. Here a constant of PI is defined in a module namespace called net.customer.geometry.
#!ek9
module net.customer.geometry

  defines constant
    PI <- 3.142

//EOF
Let's assume that the developer now needs to access PI from another module namespace called com.solutions.areas and wants to define a function to calculate the area of a circle.
#!ek9
module com.solutions.areas

  references
    net.customer.geometry::PI

  defines function

    areaOfCircle()
      -> diameter as Float
      <- result as Float: PI * (diameter/2)^2

//EOF
By using the references statement above the constant PI can be used within the module. The alternative mechanism to use PI is shown below.
#!ek9
module com.solutions.areas

  defines function

    areaOfCircle()
      -> diameter as Float
      <- result as Float: net.customer.geometry::PI * (diameter/2)^2

//EOF
Here the fully qualified name of the constant PI is used. Why have two mechanisms to do the same thing? There are cases (PI would be a bad example) where you have a construct, say a function, that has the same name as another construct in a different module. You cannot reference it because the names would clash, so the only way to access it is to use the fully qualified name. Clearly this is less convenient (especially if you want to use it in multiple places), so take care in the naming of constructs and also in how you link them together.
The same syntax is used for components, records, types, traits, classes and functions. But note there are no wildcards; you must reference every single construct that you need from other packages. This ensures that you are careful in packaging and re-use. Moreover, everything can be referenced: there is no mechanism to limit access to items that are defined in modules. This is aimed at simplicity; other languages have mechanisms that enforce hiding of constructs, EK9 does not support this.
The main focus of encapsulation in EK9 on a large scale is the component. Other constructs are considered to be building blocks to make the components. Clearly there is nothing stopping a developer from using specific namespaces to denote internal constructs that should not be exposed, for example com.geometry.internal, but this is a naming convention and is not enforced by the compiler.
Function Visibility
Just like constants, all functions have public visibility. Again, just like constants, if they are to be addressed outside the module they are defined in then they must be either referenced or addressed by their fully qualified name, which is com.solutions.areas::areaOfCircle in the case of the function outlined above.
As an aside, functions must have a unique name within a module. You cannot overload functions with the same names but different parameters (unlike methods on classes that do support this). You can have a function of the same name in a different module namespace however.
Record Field Visibility
A Record field/property visibility is always public; this means that when a record variable is
accessible all its fields are directly readable and can also be modified. As records only support fields and
operators, and there are no methods on records, method visibility is not relevant for records.
But remember it is possible to use pure methods and functions to prevent modification of record fields if desired.
Records in EK9 are quite like structs from other languages, but with the addition of operators.
Class Field/Method Visibility
A Class field/property visibility is always private, this means that classes that extend super classes do not have any access to fields/properties in their super classes. It is not possible to create a field/property in a class that is protected or public. The inner workings of a class are private to that class. It is possible to make accessor methods to expose such inner workings; clearly this is undesirable from an Object-Oriented encapsulation point of view, but it can be done and the compiler will not stop this.
All operators are public and can only be public.
Class methods have the most variety of visibility, these values can be one of:
- private - only accessible within the class defining it.
- protected - accessible to the class defining it and all subclasses.
- public - accessible by any caller, this is the default when nothing is specified.
Trait Method Visibility
Trait method visibility is always public. As traits do not have fields/properties, field visibility is not relevant.
Component/Field/Method Visibility
Just like classes; component fields/properties are always private. Operators on components are also always public.
Component methods have the following visibility:
- private - only accessible within the component defining it.
- public - accessible by any caller.
There is no concept of protected access in components, they are intended to be composed of other building blocks such as classes, functions and other components and extension should be through composition and not inheritance. If you find yourself needing complex inheritance hierarchies with components you should look to pull that functionality down into classes. Moreover, you should probably consider using some of the composition mechanisms available in classes to make them more reusable.
You may consider this advice rather opinionated and in some ways it is. EK9 has many more specific constructs than other languages; these are designed and intended to work in concert with other constructs in specific ways. Go 'with the grain' of this, going against the grain will not result in a good outcome. If you find this approach too constraining/limiting/restrictive or irritating see the other languages section - you may find a language there that you are more suited to.
Parameter Visibility
Clearly parameters passed in to a function or method must be visible in terms of being read and used, but their reassignment or alteration can be controlled via the use of the pure keyword - see the pure modifier for more details.
If you recall in the section on objects, parameters are always Objects and hence are always passed by reference. This means that it is possible to modify their internal state (subject to the pure modifier), hence all parameters can be in out parameters.
This has major implications (both good and bad). It means that if you want to return multiple values from a function or method, you can pass parameters that you intend to be modified and updated. But it also means you may inadvertently modify an incoming parameter that you did not intend to. Another way to return multiple values is to use Dynamic Classes in the form of a 'tuple', or even use a record.
Just using the ':', '=' or ':=' assignment operators would not modify the internal state of an object as it would just alter the location of where the variable in the function or method would access. The original variable would remain as is. But using an operator (or mutator method) like '+=' or '++' would actually alter the internal state of the variable. In addition the copy operator ':=:' will alter the internal state; as would the merge operator ':~:'.
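The difference between reassigning a parameter and mutating the object it refers to is not specific to EK9. The following minimal sketch uses Java (not EK9), purely as an analogy, because Java method parameters also refer to the caller's objects; ParameterDemo, reassign and mutate are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: this is Java, not EK9. All names here are hypothetical.
class ParameterDemo {

    // Reassignment: only the local parameter variable is pointed at a new list.
    // The caller's list is untouched.
    static void reassign(List<String> values) {
        values = new ArrayList<>(Arrays.asList("replaced"));
    }

    // Mutation: the list object shared with the caller is changed in place.
    static void mutate(List<String> values) {
        values.add("added");
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("A", "B"));

        reassign(names);
        System.out.println(names); // prints "[A, B]"

        mutate(names);
        System.out.println(names); // prints "[A, B, added]"
    }
}
```

Here reassign plays the role of ':', '=' or ':=' applied to a parameter, while mutate plays the role of an operator such as '+=' that alters internal state.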
So by adopting an Object only approach and not supporting any pass by value semantics EK9 does open the door to potential errors; but as most real world software always involves complex aggregates rather than just a few primitive types (like int or float) this is an issue that already has to be carefully managed. EK9 provides the pure keyword so that a developer can manage value mutations.
Next Steps
To learn more on the language itself in terms of what operators exist on built-in types and how to provide implementations in your own classes/records; take a look at the operators next.