PIC32 interpreter language features

Page 1 of 2
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
I am in the process of writing an interpreter for the PIC32 series. As this will be a 'new' language I would like to ask you all for suggestions, especially functionality that seems to be missing in the current offerings of languages. The language style will be JavaScript/C/C#-like, but will leave out the tricky parts and the parts that are not so useful on an embedded system.

It has a bunch of primitive datatypes.

One to seven bits: bool, bit, bibit, tribit, nibble, quibit, sebit, septebit. Another possibility would be bit, bit2, bit3, bit4, bit5, bit6, bit7. What do you think is the better choice?

Unsigned integers: uint8|char, uint16, uint32|int, uint64
Signed integers: int8|byte, int16, int32, int64
I especially wanted to avoid names like int, long and short, because it is always unclear how big such an integer is on a specific system. If I can squeeze the interpreter small enough it might fit an 8 bit or 16 bit processor, changing the lengths of those datatypes.

Floating point: float|single, double (according to IEEE 754). Alternative names could be binary32 and binary64. Depending on the capabilities of the compiler used I might add decimal32 and decimal64.

Strings: string, text
String will be the one made first because it is old-fashioned ASCII, similar to the underlying C language; text is planned for Unicode. I might consider making these Objects instead of primitive datatypes.

Special: bcd, bibcd, quadbcd, octbcd, hexbcd. Again, naming them bcd, bcd2, bcd4, bcd8, bcd16 is a consideration.

Assignments: =, +=, -=, *=, /=, %=, |=, &=, <<=, >>=
The normal arithmetic operators: +, -, *, /, %
Unary arithmetic: -
Comparisons: ==, !=, >, >=, <, <=
Ternary operator: ? :
Logical: &&, ||, !
Bitwise: &, |, ^, ~, @<, @>, <<, >>, ->>
@< is rotate left, @> is rotate right, ->> is shift right keeping the sign (there is a small C sketch of these three at the end of this post).

It will have some native objects: Object, Math, String, Date, Array, Convert, System, Dictionary; some specific hardware objects (this can be seen as a HAL): Uart, Ports, SPI, IIC, PMP, DMA, RTC, Interrupt, etc.; and some objects that can be made for specific hardware: Canvas, SD, LCD, GLCD, Relais, Sensor, etc.

Statements: if/else, switch, while, do, for, foreach, repeat, try/catch/finally, function, class. The switch statement will have case, default, break and fallthrough. repeat is the simplest loop that can be made; it will support continue and break:

repeat(10) { ... statements; }

There is no 'new' statement. Each object has a create method:

Object A = Object.create();
or shorthand: Object A = {};
or: Object A = { int n:0, string result:"Oke" };

I don't have inheritance yet, but this will set the interpreter up to handle things like Object.create(mySpecialObject).

The Array object is the one I had to think about for a long time. There are so many ways to define an array that it is hard to choose the 'best' one. Choices were: int[] a[10]; int[] a = new int[10]; etc. I chose this:

Array<int> A = Array.create();

The create can then be implemented to take arguments like size, initial values, etc. The interpreter can stay simple because it just hands the arguments over to the function that implements the Array object. Even if the Array object changes later, the interpreter can stay the same, making upgrades and shared development easier. However, I think this form is also nice:

Array<int> A = [1,2,3,4,5,6,7,8,9,10];

I don't have multiple dimensions yet, but that could give interesting possibilities like:

Array<int,string> A = [ [1,"a"], [2,"b"] ];

If someone has suggestions I am all ears.

Microblocks.
Build with logic.
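Since @<, @> and ->> are not everyday notation, here is a rough C equivalent of what these three operators are meant to do on a 32-bit value. This is only to pin down the semantics; it is not interpreter code, and the helper names are made up for the illustration.

[code]
#include <stdint.h>

/* Illustration only: what the proposed operators mean on 32-bit values. */

static uint32_t rotl32(uint32_t v, unsigned n)  /* v @< n : rotate left        */
{
    n &= 31;
    return (v << n) | (v >> ((32 - n) & 31));
}

static uint32_t rotr32(uint32_t v, unsigned n)  /* v @> n : rotate right       */
{
    n &= 31;
    return (v >> n) | (v << ((32 - n) & 31));
}

static int32_t asr32(int32_t v, unsigned n)     /* v ->> n : shift right, sign kept */
{
    /* Arithmetic shift: the sign bit is copied into the vacated bits.
       (In portable C this needs care; shown here only to pin down the meaning.) */
    return v >> n;
}

/* Examples: rotl32(0x80000001u, 1) == 0x00000003,
             rotr32(0x00000003u, 1) == 0x80000001,
             asr32(-8, 1) == -4.                                                */
[/code]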
isochronic Guru Joined: 21/01/2012 Location: Australia Posts: 689
Not a trivial project!! Are you going to write it in C, assembler, ...? I was (and still am) looking at learning PIC32 assembler, but the chip architecture looks (ahem, cough) fairly complicated so far, which is a good reason to stay with the safer pattern of using C, I guess. Which PIC32s, and what timescale?
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
I think it is best to write this in C. The Microchip libraries are a good base for controlling the peripherals. I would like to make the interpreter extensible, but first the framework for that has to be made solid; 'Objects' written in C or assembler can then be added. It would be nice if objects could be loaded at runtime, but for now I will be happy if I can add them to the source code.

At the moment the tokeniser and parser are written in JavaScript. This allows me to build an editor/simulator in a browser environment. The parser is of the top-down kind and generates a tree; a flattened version of that tree will be the output that the PIC32 interprets. My current goal is to get the parser to optimise this tree as much as possible. It already parses literals and manages the scopes of variables, which shrinks the tree considerably. The final tree will make clear which tokens have to be interpreted.

I have looked at a few bytecode interpreters but I don't like any of them. They use the stack for almost everything, which makes them slow - even adding two variables goes through the stack. Using registers and allocated memory will be a lot faster, a necessity if you still want to use the peripherals.

The most important feature of this interpreter is that it will be event driven. No 'super loops' will be necessary, so cooperative multitasking is automagically included, and because no context/task switching is needed there is another speed improvement. Fast peripherals will use buffers and place messages on a message queue; these messages trigger events which are then handled. If you know JavaScript you will already be familiar with this. (A small C sketch of the idea follows at the end of this post.)

The group of people I would like to reach are the ones who develop software for the web. With an interpreter like the one I have in mind it will be a small step for them to start controlling things in the real world, making web-integrated hardware a much easier task. This interpreter is the basis of a range of hardware products that I have planned. Without good software, hardware is just 'parts'.

I am not sure of the memory requirements of the interpreter; depending on those it will run on any MCU that has enough flash and RAM. Of course the wish is to make it as small as possible - we will have to see what is feasible. My estimate is that it will need at least 16k of RAM and probably 64-128k of flash, depending on which peripherals are needed.

The timescale is difficult to estimate. It is my first interpreter, so I will most definitely run into parts that are difficult to solve; that is one of the reasons why I keep the tokeniser/parser in a very familiar environment. The final parser tree has to be as simple as possible and I think I will have that ready in about a month or two. After that the interpreter for the PIC32 has to be written, with the flattened tree as its source. The interpreter will have to be able to load a tree from SD or from a serial/USB port, so it needs some capabilities for that. If there is enough memory available, another step could be to put an editor/tokeniser/parser on the PIC32 itself. It would then be much like a Maximite and could work standalone. VGA/composite on a PIC32 has already been done :) as proven by the Maximite. I would offload the VGA/composite to an add-on board so that the memory stays free for the program; that would also allow smaller PICs than the 6/795 to be used, maybe even the 16-bit ones.

Microblocks.
Build with logic.
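To show what I mean by event driven without a 'super loop', here is a very small C sketch of the idea: the interrupt handlers only post messages, and the interpreter picks them up when it is idle and runs the handler bound to that event. All names and sizes are invented for the illustration; the real interpreter would hook this into the PIC32 peripheral interrupts.

[code]
#include <stdint.h>

/* Sketch only: message queue between the peripheral ISRs and the interpreter. */
typedef struct {
    uint8_t  event;    /* e.g. EV_UART1_RX, EV_TIMER1, ... (illustrative)      */
    void    *data;     /* points at the peripheral's buffer                    */
    uint16_t count;    /* how much data belongs to this event                  */
} Message;

#define QUEUE_SIZE 16
static volatile Message queue[QUEUE_SIZE];
static volatile uint8_t head, tail;

/* Hypothetical entry point into the interpreter: runs the part of the
   parsed tree that is bound to this event.                                     */
extern void run_event_handler(uint8_t event, void *data, uint16_t count);

/* Called from an interrupt handler: cheap, no interpreting here. */
void queue_post(uint8_t event, void *data, uint16_t count)
{
    uint8_t next = (uint8_t)((head + 1) % QUEUE_SIZE);
    if (next == tail)
        return;                        /* queue full: message is dropped       */
    queue[head].event = event;
    queue[head].data  = data;
    queue[head].count = count;
    head = next;
}

/* The interpreter's main loop: no super loop polling every peripheral,
   just "wait for a message, run its handler".                                  */
void interpreter_run(void)
{
    for (;;) {
        if (head == tail)
            continue;                  /* nothing pending; could sleep here    */
        Message m;
        m.event = queue[tail].event;
        m.data  = queue[tail].data;
        m.count = queue[tail].count;
        tail = (uint8_t)((tail + 1) % QUEUE_SIZE);
        run_event_handler(m.event, m.data, m.count);
    }
}
[/code]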
Bugs2 Newbie Joined: 18/05/2012 Location: United Kingdom Posts: 29
Please could you write it to fit the 28-pin version of the PIC32. That would make it a powerful addition to the choice of available microcomputers for embedded applications.
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
@Bugs2, it depends on how small I can get the interpreter, but that is the intention. I have a few MX230/250 for testing. Don't hold your breath though, it is still some months away. The price difference is actually the only reason to go to a 28-pin version; if breadboard or through-hole use is what you want, there are many breakout boards available for that.

The interpreter is being developed to support a range of products I want to bring to market. The whole cycle of breadboarding, software writing, debugging, prototyping and final product is being reviewed to make every step more accessible to hobbyists and to small businesses that need small (< 100) production runs. I fall into the last category, so this is also a project to make my own future projects a lot easier.

Microblocks.
Build with logic.
boss Senior Member Joined: 19/08/2011 Location: Canada Posts: 268
Greetings from Vancouver. I think that most of your requirements are fulfilled by eLua. I tested this interpreter on a Cortex-M4 STM32F4 with excellent results. Unfortunately neither the ARM nor the PIC version is finished yet, but the source code is available on GitHub, and if I were a C guru like you, I would start there. See the enclosed file. Hope this helps. 2012-10-16_184055_Lua_52.zip
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
Lua is actually not usable on most MCUs: you still have to build a Lua 'environment' in C so that it can run. A few attempts have been made, but none was successful, at least to my knowledge.

Lua disadvantages: it is all meta, meta this and meta that. Garbage collectors... brrr. Collaborative multithreading... brrrr. Dynamically typed... brrrr. Etc.

The problem I have with these types of languages is that they take you too far away from the MCU. The result is inefficient memory use, and threads and garbage collection will kill critical timings. An interpreter already has a weak spot, and that is performance; when you make that weak spot even weaker by adding all of the above, it becomes unusable except for simple tasks. I did my homework, and for MCUs it comes down to a few choices: assembly or C, and when timing is not critical, MMBasic.

For most programmers C is already pretty difficult. Add in all the stuff you have to do to even get an MCU to just work, especially the peripherals with all their configuration, and you have an environment that not many will be able to use. I would like to make all of that easier and hopefully reach speeds as close to C as possible. By using an event-driven model there are no threads, no task switching, no idle times, no unnecessary use of stacks, interrupts can be handled very quickly, etc.

It will not be easy, but then again, if it were easy it probably would not be worth my time and it would certainly not be enjoyable.

Microblocks.
Build with logic.
CircuitGizmos Guru Joined: 08/09/2011 Location: United States Posts: 1425
"It will not be easy, but then again if it was easy it probably is not worth my time and it would certainly not be enjoyable." If you get 2 out of 3 you are doing OK. :-) Micromites and Maximites! - Beginning Maximite |
boss Senior Member Joined: 19/08/2011 Location: Canada Posts: 268
TZadvantage, I don't know what kind of Lua you tested, but my experience with the STM32F4 is completely different. Because Lua works with program chunks, the performance was ~2,000,000 lines/s, and double precision is awesome. I found some disadvantages too - there is no full-screen editor, I2C doesn't work yet, the RTC is not working yet either, etc. - but nothing really serious. Still, if you can do something better, why not. Regards, boss
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
Something better - that depends on what purpose it is made for. Lua-like languages are great for extending the capabilities of existing software. Sure, Lua works great with double precision on an STM32F4 with an internal floating point unit; Lua, like any other language, is dependent on the hardware it runs on. It is also dependent on the underlying program: 210 DMIPS with a good support system written in C allows Lua to be fast enough.

Lua is data-driven and dynamically typed (meaning it uses a hashtable for everything), every value has to carry metadata to distinguish its type, and it uses the stack extensively to 'talk' to the underlying system. That is the worst way of doing things when there is not much memory available.

You found some disadvantages that actually make it completely unusable on an MCU; again, that is not a shortcoming of Lua but of the 'host'. My experience is that currently there are no working 'hosts' on any MCU, so Lua does not fulfil its promise. I2C does not work, the RTC does not work, etc., because whoever 'ported' Lua to the STM32F4 did not bother to make it really work, meaning having control over the MCU. A very big fail in my opinion. Imagine MMBasic without VGA, analog, digital ports, RTC, SD card, etc. - it would be pretty useless. A few PIC32 'hosts' for Lua failed miserably, just like the one on the STM32. Is that because writing a 'host' for Lua is not as straightforward as first thought? Interrupts, timers, UARTs and other peripherals with interrupt capabilities sound almost impossible to bring up to the level of Lua. In the MCU world there are already complaints when it takes 6-7 CPU cycles to respond to an interrupt; make that a few thousand by the time it is finally at a level where Lua can do something with it. That is the hardest nut to crack, and with my interpreter it will also be the biggest challenge. By taking the interpreter a level lower - allowing direct access to ports, memory, native datatypes, etc. - the chances of getting a fast enough response are a lot better.

Lua does what it was designed for, and that is to work as a subsystem of an existing system. You have to write the modules (rocks) for that first. That means I cannot use Lua on a PIC32 without writing the whole main system to support Lua. It is not meant to be a standalone interpreter; it just adds higher-language support to whatever is under it. If I have to make all of those peripherals work in C first and expose them to Lua, I can just spend a few hours more and finish the application in C. Instead of concentrating on the higher language, in the MCU world you have to concentrate on the built-in peripherals first; if there is no need for peripherals, there is no need for an MCU. Lua, in my opinion, was not developed with MCUs in mind but as an alternative to other object-oriented languages, and as a language to be embedded within an application (there it really shines), just like JavaScript or VBScript in MS Office products.

Microblocks.
Build with logic.
jdh2550 Regular Member Joined: 16/07/2012 Location: United States Posts: 62
Hi TZ, Three parts:

1) This sounds like a great project. I'd be interested in "following along" as you work on it, and if you need a beta-tester count me in. I'm a software engineer with 20+ years of experience. I'm enjoying using the 'mite with MMBasic and I've added some stuff to MMBasic, so it's actually a good environment for me (because C doesn't faze me!). However, a more modern higher-level language would be good. I had been considering looking more closely at MMBasic to see how feasible it would be to separate the language/interpreter and create a HAL (hardware abstraction layer), so I could work against the HAL and experiment with switching in different languages (LISP anyone? - just kidding!).

2) It's great to see interpreted languages making headway in this area. It's not just about accessibility for "new programmers" - it's about how well interpreters match the experimenter's development cycle. Even though I'm plenty experienced with C/C++, I use the 'mite more than the 'duino because of the interactive/iterative nature of my work process. Interpreted languages in general are making a massive comeback now that we have processors (and MCUs) that are way more powerful. It's a good time to be a geek! The 'duino made MCU work accessible to me; the 'mite is making prototyping quicker. I'm even thinking of trying to design my own two-channel CAN shield.

3) And now, to answer your original question: go with the numbered versions of the names. Hands-down winner in my mind (i.e. bit2, not bibit). Of course, if it were C I'd simply #define things to avoid the confusion you mention about long, int, etc. (see the little <stdint.h> snippet below). I also think that more people will understand bit6 than sebit.

I look forward to following your progress. And, when you're ready, tell me the "easiest" way for me to get a PIC environment to try out / test your code.
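For what it's worth, C99 already answers the "how big is an int here?" question with <stdint.h>, which is more or less what the proposed uint8/int32-style names mirror:

[code]
#include <stdint.h>

/* Exact-width types: the size is the same on every platform. */
uint8_t  flags;      /* always  8 bits */
int16_t  offset;     /* always 16 bits */
uint32_t baud;       /* always 32 bits */
int64_t  total;      /* always 64 bits */
[/code]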
jdh2550 Regular Member Joined: 16/07/2012 Location: United States Posts: 62
A couple of observations:

[quote]There is no new statement. Each object has a create method. Object A = Object.create(); [/quote]

How do I delete an object? Is there dynamic memory management? Why "create" instead of "new"? (Just curious.)

I see you've ruled out Lua as unsuitable. Have you any experience with Ruby? Could a stripped-down eRuby be a contender?

I can appreciate that an interpreter for an MCU has unique requirements given the domain space - however, a "new" language will turn off some amount of your potential user base. Just a thought.
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
In this interpreter, so far, Objects are the HAL. Syntactically they can be used like objects in JavaScript or any other language that uses the "." notation. Every object can then be called like this:

HAL.UART1.send("Hallo"); or UART1.send("Hallo");

"HAL" is optional, just like "window" is optional when you use JavaScript in a browser. Being event driven makes this possible:

UART1.onreceive = receiver;
function receiver(buffer, count) { }

The parser is now capable of parsing anonymous objects. It is currently the only 'software' object. It will make encapsulating and returning values a lot easier, like:

return { x:5, y:10 };

As this software Object has no methods, no private/public, etc., it is a pretty simple but useful one. I still have to decide whether the whole OOP side will be functional, prototypal or class-based. As this will be a very hard part, I will leave it for version 2. If I get the HAL to work and the interpreter can interact with the HAL well, I will be more than happy. :)

With an MCU you have peripherals sharing pins. Every object in the HAL will have a create() method; calling it will reserve memory, buffers, and whatever else it needs. A destroy() method will have to clean up and release resources back to memory (and also release IO pins for another function). The create() will need to check whether those IO pins are available and, if so, lock them to prevent other uses.

All the 'objects' in the HAL will need buffers and/or ways to respond to hardware events - a timer for slow events or interrupts for fast events. This will all have to be done in C. It would be super if they could be loaded from SD or other memory; I will settle for in-flash for now. :)

The interpreter will get events from this HAL and has to start interpreting the part of the parsed tree that is bound to each event. It can only do that when the interpreter is idle. The events from the HAL will be a queue that overwrites similar events. For instance, when a character is complete in a UART receive buffer, the HAL can push a message on the queue like (onreceive, buffer, 1); the buffer then contains only 1 character. If characters come in more quickly than they are processed, it overwrites that message with a new one, (onreceive, buffer, 2), and the buffer then contains more than one. This mechanism gives the interpreter time to catch up if it was busy with something else. The HAL will be responsible for making sure nothing is lost (unless of course the interpreter is too slow and overruns occur; if RTS/CTS or XON/XOFF is used, that can be prevented). A small C sketch of this overwrite queue follows at the end of this post.

I have not even started writing stuff for the HAL. I welcome support from others.

Microblocks.
Build with logic.
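A rough C sketch of the 'overwrite similar events' queue described above: one pending slot per event type, so a newer message simply replaces the one still waiting and the interpreter can catch up without the queue growing. The names (hal_post, run_handler, the EV_ numbers) are invented for the illustration and are not the final HAL API.

[code]
#include <stdint.h>
#include <stdbool.h>

/* Illustrative event numbers. */
enum { EV_UART1_RX = 0, EV_TIMER1, EV_COUNT };

typedef struct {
    bool     pending;
    void    *buffer;   /* the peripheral buffer the handler should read   */
    uint16_t count;    /* e.g. number of characters accumulated so far    */
} PendingEvent;

static volatile PendingEvent pending[EV_COUNT];

/* Hypothetical entry into the interpreter for the handler bound to `ev`. */
extern void run_handler(uint8_t ev, void *buffer, uint16_t count);

/* Called from the ISR / HAL: posting the same event again just overwrites
   the message that is still waiting, exactly as described above.          */
void hal_post(uint8_t ev, void *buffer, uint16_t count)
{
    pending[ev].buffer  = buffer;
    pending[ev].count   = count;
    pending[ev].pending = true;
}

/* Called by the interpreter when it is idle: take one pending event.
   (Real code would briefly disable the interrupt around the take.)        */
bool hal_dispatch_one(void)
{
    for (uint8_t ev = 0; ev < EV_COUNT; ev++) {
        if (pending[ev].pending) {
            pending[ev].pending = false;
            run_handler(ev, pending[ev].buffer, pending[ev].count);
            return true;
        }
    }
    return false;      /* nothing to do */
}
[/code]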
mattma Newbie Joined: 01/10/2012 Location: Australia Posts: 25
Hi, What a great project! I'm really looking forward to watching it evolve. With the number of people who have been running into memory issues, strings seem to be one area where a bit more space could be created. I was wondering if you had thought about implementing something analogous to SQL VARCHAR or NVARCHAR where, as I understand it, no space is reserved for future growth of the string beyond the currently assigned value. Then when you add to the string, the existing memory allocation is released and new memory is allocated. I believe C# does the same. This could help free up reserved memory that may never get used? I know you mentioned it was slated for a possible future version, but I would like to flag strong interest in loadable objects, or maybe a swap file. Great stuff. Cheers, matt
djuqa Guru Joined: 23/11/2011 Location: Australia Posts: 447
Even better idea: lessen the dependence on strings. Most programs that use a lot of static strings can be rewritten to reduce string usage to a minimal number of re-used variables and constants. VK4MU MicroController Units
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
Strings are not really a problem; it is the concatenation of strings that makes them difficult or time consuming. Lua, as one example, has immutable strings (immutables are the norm in functional programming languages) - not really what you want on an MCU. JavaScript has sort of the same problem: it allocates room for a string, and once you add something to it, it creates a completely new string, leaving the old one to be garbage collected. C# is the same. You can see the pattern: all dynamic and OOP languages have the same basic problem. It is hidden by large amounts of available memory and lots of CPU power, but repeat the operation often enough and the inefficiency shows.

If you want fast string concatenation you have to go around the basic string handling. In JavaScript you are better off using an array, like this:

var html = [];
html.push('<div>');
html.push(someString);
html.push('</div>');
element.innerHTML = html.join('');

The example is a little simple, but imagine you have that in a loop iterating many times: compared to doing html += someString many times it is hundreds if not thousands of times faster. In C#, when you work with strings, it is much better to use a StringBuilder - same results, much faster.

A StringBuilder can have an initial size, and that makes it a good candidate for an MCU environment. You could estimate how long a string will be and do this:

string text = String.create(1000);

This would then reserve 1000 bytes plus some interpreter overhead.

string text = "Hello world";

This would reserve 11 bytes plus some interpreter overhead. The overhead would be 3-4 bytes for the data and about 4 for the variable pointer, plus a one-time lookup-table entry for the variable name, consisting of a key (the length of the name) and a pointer of 32 bits (maybe 16 or 8, depending on where it can be stored). The create method can have extra arguments to allow for growth, growth steps, maximum size, etc. This would all be encapsulated by a function written in C; malloc, realloc, etc. would then take from the heap only what is necessary. If the programmer sizes the strings right it will prevent reallocations, speeding things up. In the underlying C you only have to append to the end of what is used and increase a counter. By not using C's convention of a null-terminated string you are able to put nulls in your string - very nice when you use it as a buffer. (A rough C sketch of such a stringbuilder follows at the end of this post.)

There are many ways to handle strings, and I think the way C# does it with a StringBuilder comes closest to what I want. Maybe preventing the use of + with strings can push a programmer to use the stringbuilder; it will instill good practices. :) Again, as with everything you use, you should release the resources: text.destroy() or some syntax like that, or text.release(). If it is to be reused, which will happen most of the time, you could do text.clear(); this just sets the length to zero but keeps the memory allocated. Having control like that is crucial on an MCU. Fine-tuning can then be done from the higher-level language by giving it access to the underlying model through methods/attributes.

By the way, if variables are used within a function, that function creates a scope hashtable where all the local variables live. When the function finishes, this scope gets cleaned, not released. The difference is that when the function is called many times, the allocation of resources has already been done, giving a large increase in speed. This obviously only applies to functions that are not used recursively; those will create new scopes every time and release them when finished. To detect recursion, all that needs to be done is to walk the tree of scopes and see whether we are already inside the function. Tail calls would be nice - version 3? :)

The road I am taking with this interpreter is 'old school'. Nowadays, with everything swimming in large amounts of memory and amazing amounts of CPU power, many programmers are unaware of what is taking place underneath. I have seen the results from the younger generations, and with a few exceptions they really don't know what is happening and are completely taken by surprise when I show them an alternative way, using 10% of the memory and - my highest score yet - running about 10000 times faster. Instead of installing eight or nine extra servers, just one is enough. I have worked for a few companies that needed near-realtime processing: stock market quotes, news feeds, websites that get a million hits in 10 minutes, that kind of area. I was surprised to learn that my 'old' ways were and still are very valid, and I find that is again the case with MCUs. With MCUs the use of "trampolines" is a good way to maintain speed and cross the boundaries between assembler, C and an interpreter; for a good primer about trampolines and their uses see the Dr. Dobb's article on trampolines. For applications and websites where speed and resources are not a concern: objects and dynamic features galore. :)

Microblocks.
Build with logic.
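A rough C sketch of the stringbuilder-style string described above: length-counted rather than null-terminated, sized by the programmer, and grown only when needed. The names str_create/str_append/str_clear/str_destroy are made up for the illustration and are not the final interpreter API.

[code]
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: growable, length-counted string (embedded zeros allowed). */
typedef struct {
    uint8_t *data;
    uint32_t length;    /* bytes in use                 */
    uint32_t capacity;  /* bytes reserved on the heap   */
} Str;

Str *str_create(uint32_t capacity)          /* string text = String.create(1000); */
{
    Str *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->data = malloc(capacity ? capacity : 1);
    if (!s->data) { free(s); return NULL; }
    s->length = 0;
    s->capacity = capacity ? capacity : 1;
    return s;
}

/* Append raw bytes; reallocates only if the programmer sized it too small. */
int str_append(Str *s, const void *src, uint32_t n)
{
    if (s->length + n > s->capacity) {
        uint32_t newcap = s->capacity * 2;
        if (newcap < s->length + n) newcap = s->length + n;
        uint8_t *p = realloc(s->data, newcap);
        if (!p) return -1;
        s->data = p;
        s->capacity = newcap;
    }
    memcpy(s->data + s->length, src, n);    /* embedded zeros are fine here */
    s->length += n;
    return 0;
}

void str_clear(Str *s)   { s->length = 0; }            /* text.clear()   */
void str_destroy(Str *s) { free(s->data); free(s); }   /* text.destroy() */
[/code]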
JohnS Guru Joined: 18/11/2011 Location: United Kingdom Posts: 3802
Trampolines - what a funny name for techniques which go back decades! I don't know the earliest OS to use them, but DEC's RSX-11M certainly did in about 1975; I think it was at about V3 by then. Unix used something similar, at least by the 6th Edition (*). They're so obvious that I don't recall anyone giving them weird names.

(*) Anyone know John Lions and his "underground" commentary?

John
MicroBlocks Guru Joined: 12/05/2012 Location: Thailand Posts: 2209
:) Old jump tables and vector tables - nothing new. But you have to give them credit for inventing good names. What do you think about 'detours'? :) Same thing. In practice not much has changed; only the speed and the optimizations at the hardware level have been cranked up. Software just keeps moving further and further away from what is underneath - good for many tasks, not for others like controlling hardware. Look at what a compiler produces, no matter whether it is C(++), C#, Java or whatever: it is still code with registers, stacks, loops and jumps. They are called virtual machines, bytecodes, etc. - fancy names for time-consuming layers.

Microblocks.
Build with logic.
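For anyone who has not bumped into the term: a jump table in C is just an array of function pointers indexed by an event or opcode number - one indexed call instead of a ladder of ifs. A tiny illustration (all names made up for the example):

[code]
#include <stdio.h>

typedef void (*handler_t)(void);

static void on_tick(void)  { puts("tick");  }
static void on_rx(void)    { puts("rx");    }
static void on_error(void) { puts("error"); }

/* The jump table itself: index = event number, entry = handler to call. */
static const handler_t table[] = { on_tick, on_rx, on_error };

void dispatch(unsigned event)
{
    if (event < sizeof table / sizeof table[0])
        table[event]();        /* one indexed, indirect call */
}
[/code]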
JohnS Guru Joined: 18/11/2011 Location: United Kingdom Posts: 3802
Linux uses trampolines but they are something else! Great idea to have 2 different things in software with the same name... John
graynomad Senior Member Joined: 21/07/2010 Location: Australia Posts: 122
[quote]bit, bit2, bit3, bit4, bit5, bit6, bit7 [/quote]
I also vote for these versions, but please keep them orthogonal, i.e. bit1, not just bit. Arduino and others have stuff like Serial, Serial1, Serial2, etc. It's really annoying.

[quote]@< is rotate left, @> is rotate right,[/quote]
Do you ever need rotate through sign in an HLL? I can see a good argument for rotating the bits in a variable, say when driving the cathodes of a MUXed LED display, where you seed the variable with 1 and then let it go round and round. But when would you need to use the sign flag?

[quote]Array<int> A = [1,2,3,4,5,6,7,8,9,10]; [/quote]
Maybe things like:
Array<int> A = [1..10]
Array<int> A = [1-10,15,30-35]
Maybe associative arrays? Or am I getting carried away :)

A "boolean" data type?

Rob Gray, AKA the Graynomad, www.robgray.com