Development Status of Lexer

Notes

  • PHP directive test:
    • phptest -test ShowMePhpFeatures for a summary of the directive tests
      • see test/output/PhpFeatures.md or view in the terminal
    • phptest -test Directives -run Directive.TestName
      • add -stop_loop 50 to stop after the 50th loop
      • add -version 0.1 to run without new code. Default in tests is version 1.0
      • see test/src/Php to create a new directive test
      • see test/Tester.php::runDirectiveTests()
  • Generate Documentation (memory limit):
    • php -d memory_limit=900M "$scrawl_dir/scrawl" where $scrawl_dir is the path to Code Scrawl's bin directory

Bad stuff

  • function abc() use ($var){} uses method_arglist instructions. It works, it just does not create the ideal AST structure.

Versions

  • v0.2: has simple (poor) bash lexing and old (incomplete, partially broken) php lexing
  • v0.3: I don't know (broken, probably?)
  • v0.5: I don't know (broken, probably?)
  • v0.6: abandoned intermediate version, I BELIEVE (but not 100% sure)
  • v0.7: The up and coming version
  • v0.8: get rid of old php grammar. Clean up tests. Document. Maybe some additional polish & niceties for running the lexer.

Grammar Changes

  • Feb 14, 2022, PHP: reset $xpn on op_block_end() when ast type is block_body. See Php->testScrawl2()

Nov 21, 2023

Output ASTs as code. I'd like a proof of concept that starts with ASTs and outputs PHP code. If time allows, I'd also like to allow output of JavaScript code. (I'm sticking with those two because I know them well)

I need to start with an AST, then add a method to it to output code. If it contains other ASTs, those will also have their output-code methods called.

I'll start with an empty class because I already have good parsing of classes. I'll then add in properties & methods with no body.

WHOO!! PROOF OF CONCEPT WORKS. I did:

  • Create ClassAst, DocblockAst, and PropertyAst
  • Create get_php_code() & get_javascript_code() methods on each to return string code in the target language
  • Create test class Translate at test/run/translate/Translate.php with a hard-coded ast to js & php test, and a test outputting the code for test/output/php/tree/SampleClass.js.
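
A minimal sketch of the idea above, where each AST returns its own code and calls the same method on the ASTs it contains. The class shapes, constructor arguments, and property names here are assumptions for illustration, not the project's actual ClassAst or PropertyAst; only the get_php_code() and get_javascript_code() method names come from the notes. It assumes PHP 8.0+ for constructor promotion.

```php
<?php
// Hypothetical sketch: these class shapes are assumptions, not the real ASTs.

class PropertyAst {
    public function __construct(
        public string $name,
        public string $value,               // raw source text of the value, quotes included
        public string $visibility = 'public',
    ) {}

    public function get_php_code(): string {
        return "    {$this->visibility} \${$this->name} = {$this->value};";
    }

    public function get_javascript_code(): string {
        // This sketch simply drops the visibility keyword for JavaScript class fields.
        return "    {$this->name} = {$this->value};";
    }
}

class ClassAst {
    /** @param PropertyAst[] $properties */
    public function __construct(
        public string $name,
        public array $properties = [],
    ) {}

    public function get_php_code(): string {
        // Child ASTs produce their own code; the class only wraps it.
        $body = array_map(fn(PropertyAst $p) => $p->get_php_code(), $this->properties);
        return "class {$this->name} {\n" . implode("\n", $body) . "\n}";
    }

    public function get_javascript_code(): string {
        $body = array_map(fn(PropertyAst $p) => $p->get_javascript_code(), $this->properties);
        return "class {$this->name} {\n" . implode("\n", $body) . "\n}";
    }
}

$class = new ClassAst('SampleClass', [new PropertyAst('dog', "'PandaBearDog'")]);
echo $class->get_php_code(), "\n\n", $class->get_javascript_code(), "\n";
```

Keeping one method per target language on every AST means adding a new AST type only requires writing its own two methods; the containing ASTs do not change.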

Notes:

  • SampleClass.js outputs with the property value wrapped in extra quotes like dog = '"PandaBearDog"'. I'm using var_export() bc my sample ast in the first test doesn't have the string quote marks around the value. That AST is probably written incorrectly. Idk.
  • I'm tired.
  • I would like each of the new ASTs to be strongly typed & docblock-commented to describe what each property is.
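
The extra quotes in the first note can be reproduced in isolation. This small sketch only shows what var_export() does with a value that already carries its source quote marks versus one that does not; the variable names are made up for the example.

```php
<?php
// Value stored WITHOUT quote marks (like the hard-coded sample AST).
$bare = 'PandaBearDog';
// Value stored WITH its source quote marks (like a value taken from parsed code).
$quoted = '"PandaBearDog"';

// var_export() always adds its own single quotes around a string.
echo 'dog = ' . var_export($bare, true) . ";\n";    // dog = 'PandaBearDog';
echo 'dog = ' . var_export($quoted, true) . ";\n";  // dog = '"PandaBearDog"';
```

So the two test ASTs likely disagree about whether a string value includes its quote marks, and var_export() is only correct for the unquoted form.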

Oct 21, 2023

Making a .bak file of the current PHP Grammar. Going to maybe make some major changes to the directives. Tempted to start fresh, but kind of know that's gonna be a bad idea. I want to eliminate all of the php components, except for more generic directives. I want all of the logic to be in the Directive definitions.

Oct 18, 2023

I'm basically done with CreateParser.src.md. I maybe should add a better example of a full Directive. I may want to move the breakdown of instructions into its own file. And maybe rename CreateParser. Also, I suppose I haven't actually touched on any of the PHP of creating a Parser. Yeah. I guess I'll need to do that. So yeah, there's work to do yet on CreateParser.src.md

I updated the readme slightly. Commented out the "Use an ast" & "parse code" documentation links, but those will need to be made eventually.

Oct 17, 2023

Documented in CreateParser.src.md. Did a little docblocking. Only 7 more instructions to document!!!

Oct 16, 2023

I worked on documentation. I started a new README (DON'T RUN SCRAWL!). I added .docsrc/CreateParser.src.md and documented a fair bit about how the lexer works, as well as several instructions. I documented ALL the instructions that are in the MappedMethods.php file. I need to still document all the instructions in the switch statement in Instructions.php.

At some point it might be nice to refactor this stuff. It's all pretty confusing. But, I'm doing fine going through it and writing documentation, so I guess it doesn't really matter. As long as the documentation is good, it should be fine.

I probably need a new / better format for documenting the instructions, but what I have currently in CreateParser.src.md is quite alright for now.

I didn't add or modify any docblocks, and some of the documentation I found was wrong.

Aug 10, 2023

  • started in-depth parsing of vars in method body
  • added some return types
  • added some docblocks
  • added \Tlf\Lexer\Versions for versioning code
  • added $signal to lexer for expressing expectations simply
  • added has() to ast to check for a key

Production by default runs with the old lexing. Tests, however, by default use Lexer\Versions::_1 (i.e. 1.0)

To check the var parsing & new signal-stuff see:

  • phptest -test Directives -run Var.Assign.Variable -stop_loop 27
  • phptest -test Directives -run Var.Assign.String (I broke this somehow, but it WAS working)
  • or to run directive tests without new code: phptest -test Directives -version 0.1
  • code/Php/Operations.php
  • code/Php/Words.php
  • test/src/Php/Vars.php

TODO

  • (May 13, 2022) LilMigrations fails parsing. I believe foreach() is the problem, starting line 105. Need to figure out / fix this.
  • (Apr 27, 2022) php method body does not contain blank newlines. It should, though, so I can print it back out just as it is written
  • review documentation
  • add to documentation: test/output/PhpFeatures.md contains a list of directive unit tests (their description, input code, and whether they passed their tests). Looking at this is a good way to see what language features have been implemented
  • maybe remove nesting a php class inside a namespace ast ... i just don't like it
  • am i handling interfaces?
  • clean up the php grammar (remove old commented debug code & whatnot)
  • create passing BashGrammar tests
  • create an easy mode feature like: $ast = $lexer->easy_mode('php', $string_to_lex)
  • delete the old BashGrammar, JsonGrammar, and defunct Docblock grammar
  • document directives (use docblocks, but i don't think the lexer can currently get docblocks attached to array keys like i want for this)
  • make generic base grammar that works how the PhpGrammar does
    • maybe make a StringGrammar as well, since string features are often shared between languages

March 13, 2022

All my tests are passing! I haven't implemented all of the PHP features, but I have the majority of them handled. I haven't added anything that's new in 8.0 or 8.1 & I don't know how I'll handle versioning ...

My directive tests are cleaned up. All my grammar tests are cleaned up. Basically, this thing is now awesome, robust, trustworthy.

It will still probably need features added here & there, but it's finally at a point where I can count on it.

It would be nice (but not necessary) to write a code-scrawl extension that documents all the directive tests for each grammar.

I would like to add an "EasyMode" to simply do like ... $ast = $lexer->easy_mode('php', $string_to_lex)

And I want to clean up this document.

Eventually, I want to catch expressions, so a script or function body would have an array of expressions like $var = 'whatever'; and return true; as well as breaking down ifs/foreaches (ones with a block at least) as expressions containing expressions.

Eventually, i would like to break down each expression so $var = 'whatever' is generic like: type=set variable, name='var', value='whatever'

Eventually, all this expression breakdown could be used to transpile between different languages.
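
A rough sketch of what that generic breakdown could look like and how it might feed a transpiler. The array keys follow the note above (type, name, value); the expression_to_code() function and the choice to store the value as raw source text are assumptions. The variable is named message instead of var because var is a reserved word in JavaScript. Assumes PHP 8.0+ for match.

```php
<?php
// Hypothetical generic expression; key names follow the note above.
$expression = ['type' => 'set variable', 'name' => 'message', 'value' => "'whatever'"];

// Hypothetical helper: turn one generic expression into code for a target language.
function expression_to_code(array $expr, string $language): string {
    if ($expr['type'] !== 'set variable') {
        throw new InvalidArgumentException("Unsupported expression type: {$expr['type']}");
    }
    return match ($language) {
        'php'        => "\${$expr['name']} = {$expr['value']};",
        'javascript' => "let {$expr['name']} = {$expr['value']};",
        default      => throw new InvalidArgumentException("Unsupported language: $language"),
    };
}

echo expression_to_code($expression, 'php'), "\n";         // $message = 'whatever';
echo expression_to_code($expression, 'javascript'), "\n";  // let message = 'whatever';
```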

TODOs (old, but maybe still relevant)

  • Finish the GrammarTesting documentation (once method bodies are available)

High Priority (internals)

  • Make DocblockGrammar work for bash (docblock starts with ##)
  • BashGrammar + simple tests. Catch:
    • docblocks
    • functions
    • maybe comments
    • nothing else

Low Priority

  • rename previous to meta or data. I'm thinking meta. & make it its own group if it's not
  • Add meta information to ASTs & OPTIONALLY include meta information when getting an AST tree
    • Convert type to _type on ASTs maybe? (it's meta information)
  • add additional command arg expansion features ([] and !)
  • abort feature:
    • Add an instruction that: Mark a directive as an abortion target
    • Add an instruction that: pop directives until a named abortion target is on the top of the stack
  • Threaded parsing. Starts as a single thread, then any instruction can create a new thread & now we'll continue processing on the main thread & process on the new thread as well. Ultimately only one thread can "win", so all but the "correct" thread will be discarded
    • move the while loop in Internals into its own function like do_the_actual_lexing($lexer, $token) or something.
    • Might just have to create multiple lexer instances to do this.
  • Performance: Use arrays, & do object conversions on directives at the last possible moment. I only want them as objects, so they're easier to work with by reference.
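
The abort feature in the list above could be sketched as a stack with two operations. Everything here (the DirectiveStack class, its method names, and the directive names in the usage) is hypothetical and only illustrates the "mark a target, then pop until it is on top" idea; it is not how the lexer currently stores directives.

```php
<?php
// Hypothetical directive stack: each entry is a name plus an "abort target" flag.
final class DirectiveStack {
    /** @var array<int, array{name: string, abort_target: bool}> */
    private array $stack = [];

    public function push(string $name): void {
        $this->stack[] = ['name' => $name, 'abort_target' => false];
    }

    /** Instruction 1: mark the directive on top of the stack as an abortion target. */
    public function mark_abort_target(): void {
        if ($this->stack === []) {
            throw new LogicException('No directive to mark.');
        }
        $this->stack[array_key_last($this->stack)]['abort_target'] = true;
    }

    /** Instruction 2: pop directives until the named abortion target is on top. */
    public function abort_to(string $name): void {
        $is_target = fn(array $d): bool => $d['abort_target'] && $d['name'] === $name;
        // Check first, so a missing target leaves the stack untouched.
        if (array_filter($this->stack, $is_target) === []) {
            throw new LogicException("No abortion target named '$name' on the stack.");
        }
        while (!$is_target($this->stack[array_key_last($this->stack)])) {
            array_pop($this->stack);
        }
    }

    /** @return string[] directive names, bottom of the stack first */
    public function names(): array {
        return array_column($this->stack, 'name');
    }
}

$stack = new DirectiveStack();
$stack->push('php_code');
$stack->push('class');
$stack->mark_abort_target();   // 'class' is now an abortion target
$stack->push('method');
$stack->push('block_body');
$stack->abort_to('class');     // pops 'block_body' and 'method'
print_r($stack->names());      // php_code, class
```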

Latest (pretty old list)

  • Support return by reference on methods: public function &global_warming() (this method launches billionaires to space temporarily & returns a reference to an array detailing the CO2 pollution. (which is bad, because you can just lie and change it to less pollution, now that we're done with the space trip))
  • Support all arithmetic operators
  • added none operation target, which just stops the directive from being processed if it is that operation
    • Used by the /* operation so that the docblock directive still handles it
  • Refactored operation stuff & multi-char operations are now functional.
  • wrote operation_match() method to check if the current buffer is an operation. Simplifies matching multi-char operations.
  • Operators are refactored, so multi-char operators can be used
  • support simple array declaration
  • add Values trait to test value assignment
  • parse traits the same way I do classes (for the most part)
  • implemented: namespace Abc; class Def {} - get fully qualified name for the class
  • support & test php class use TraitName;
  • support & test php class consts
  • add a stop instruction set to php_code directive
  • catch <?php and ?>
  • Unit test static properties and methods
  • add minimal debugging output to the wd_ and op_ routing.
  • clean up notes
  • Property & method tests passing (& all other current tests)
  • Fix the "methods are inside class modifiers" bug
  • Add ability to -run Namespace.* and run any tests that wildcard match, basically.
  • Separate php directive tests into traits
  • Add word routing to wd_theword($lexer, $xpn, $ast)
  • Rename OtherDirectives to CoreDirectives
  • Refactored operations into Handlers, Operations, and Words
  • Methods are completely handled! (I think)
  • Implementation almost complete for properties
    • Properties handle string concatenation, but NOT numeric operations
  • add operation routing to op_opname($lexer, $xpn)
  • New mechanism of operations & words
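
For reference, a minimal sketch of the kind of check the operation_match() item above describes: testing whether the current buffer ends in an operation, with longer operations tried first so multi-char ones win. The signature, the operation list, and the longest-first rule are assumptions for illustration; the real method may work differently. Assumes PHP 8.0+ for str_ends_with().

```php
<?php
/**
 * Hypothetical stand-in for operation_match(): return the operation the
 * buffer ends with, or null. Operations are assumed to be non-empty strings.
 *
 * @param string[] $operations
 */
function operation_match(string $buffer, array $operations): ?string {
    // Longest first, so '=>' is preferred over '>' and '/*' over '*'.
    usort($operations, fn(string $a, string $b) => strlen($b) <=> strlen($a));
    foreach ($operations as $operation) {
        if (str_ends_with($buffer, $operation)) {
            return $operation;
        }
    }
    return null;
}

$operations = ['=', '>', '*', '=>', '/*'];
var_dump(operation_match('$a =>', $operations));   // string(2) "=>"
var_dump(operation_match('code /*', $operations)); // string(2) "/*"
var_dump(operation_match('abc', $operations));     // NULL
```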