Development Status of Lexer

Versions

  • v0.2: has simple (poor) bash lexing and old (incomplete, partially broken) php lexing
  • v0.3: status unknown (probably broken)
  • v0.5: status unknown (probably broken)
  • v0.6: abandoned intermediate version, I BELIEVE (but not 100% sure)
  • v0.7: The up and coming version

Grammar Changes

  • Feb 14, 2022, PHP: reset $xpn on op_block_end() when ast type is block_body. See Php->testScrawl2()

Feb 11, 2022 ISSUE

  • phptest -test MethodBodyIsString ... the method body is showing as an array when it should be a string ... i messed around with it a bit & at times it was returning an ast object instead of an array ... idk what the problem is ... ALSO, the third method has a body nested in the body ... which makes no sense ... so this is a whole heck of a thing, it seems.
    • i was accidentally running the test with cache enabled so ... whoops
    • i decided to just remove body because it wasn't working anyway ...
    • so i commented out a line in Operations that set 'body' on the ast
    • this needs to be investigated

Jan 26, 2022

  • the PhpGrammar tests have a bunch of commented out lines ... this needs to be refactored so I can run these tests with confidence rather than relying on manual review
  • I need to write unit tests for the string_backslash issue (and maybe others that I've solved, with operations & such)

Jan 25, 2022

  • work on the php grammar ... handles functions (not just methods), and fixes bugs with catching 'class' and 'function' keywords in the wrong context

Next

  • the string_backslash directive is supposed to handle backslash-escaped characters inside a string ... but it only catches the first one, so a string like "iam \" a string \" with two quote marks" fails on the second \". I tried a workaround in php, but that was a mess ... I don't think it's worth refactoring the internals ... I need a unit test just for strings that contain backslash-escaped quotation marks. i THINK it's because of the order ... after string_backslash is hit the first time and then stopped, it gets appended to the end of the list of unstarted directives rather than put back in its original spot ... that's 100% it
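The ordering theory in the note above can be shown with a tiny sketch. This is NOT the lexer's real internals: the functions requeue_append() and requeue_in_place() and the directive name string_stop are made up for illustration. It only shows how appending a stopped directive changes the order in which directives would be tried next time.

```php
<?php
// Hypothetical sketch of the suspected ordering bug (names are assumptions).

/** Re-queue a stopped directive by appending it (the suspected behavior). */
function requeue_append(array $unstarted, string $name): array
{
    $unstarted[] = $name;
    return $unstarted;
}

/** Re-queue a stopped directive and restore the original declaration order. */
function requeue_in_place(array $unstarted, string $name, array $originalOrder): array
{
    $unstarted[] = $name;
    // Sort by each directive's position in the original declaration order.
    usort(
        $unstarted,
        fn(string $a, string $b) =>
            array_search($a, $originalOrder, true) <=> array_search($b, $originalOrder, true)
    );
    return $unstarted;
}

$original  = ['string_backslash', 'string_stop'];
// string_backslash was started, so only string_stop is still unstarted.
$unstarted = ['string_stop'];

$appended = requeue_append($unstarted, 'string_backslash');
$restored = requeue_in_place($unstarted, 'string_backslash', $original);

echo implode(', ', $appended), "\n"; // string_stop is now checked first
echo implode(', ', $restored), "\n"; // string_backslash is checked first again
```

If the real list behaves like requeue_append(), the second \" would be seen by the stop directive before string_backslash gets a chance, which matches the failure described.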

... i found a workaround for the string backslash ... added its own handler ...

... phtml classes are all passing the methods count ...

I need to improve the generic test & its output ... it all just takes up so much space for no good reason

Known Issues (PhpGrammar)

  • when there IS a namespace in a file, class is nested underneath it. When there is NO namespace in the file, namespace='' and class=[] are adjacent ... for now, i just have a workaround in the test code
  • the string backslash issue ... i'd rather not have it be a workaround in php ... idk how to fix it and it's probably not a priority
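A workaround for the namespace issue could look roughly like the sketch below. The array shapes are my assumption based on the description in the first bullet (a nested 'class' key under an array-valued 'namespace', or an adjacent 'class' key next to namespace=''); the real ast output may differ, and classes_from_file_ast() is a hypothetical helper, not something in the test code.

```php
<?php
// Hypothetical helper: return the class list no matter which shape the file ast has.
function classes_from_file_ast(array $fileAst): array
{
    $namespace = $fileAst['namespace'] ?? '';
    if (is_array($namespace)) {
        // Namespace present: classes are assumed to be nested underneath it.
        return $namespace['class'] ?? [];
    }
    // No namespace: classes are assumed to sit next to namespace=''.
    return $fileAst['class'] ?? [];
}

// Assumed shapes, for illustration only.
$withNamespace    = ['namespace' => ['name' => 'Abc', 'class' => [['name' => 'Def']]]];
$withoutNamespace = ['namespace' => '', 'class' => [['name' => 'Def']]];

var_dump(classes_from_file_ast($withNamespace) === classes_from_file_ast($withoutNamespace));
```

Tests written against a helper like this would not care which shape the grammar emits, so the inconsistency could be fixed later without rewriting them.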

Welcome back (Jan 24, 2022)

  • I can't parse a method with a body
  • See run/Documentation.php ... There is a test for PhpGrammarNew that shows very nicely how to run the lexer.
  • test/output/PhpFeatures.md contains a list of directive unit tests (their description, input code, and whether they passed their tests). Looking at this is a good way to see what language features have been implemented
  • PhpGrammarNew's test PhpGrammarNew_SampleClass is failing and i don't know why (it was failing before i moved files around, but i need to fix that too)

Things I want to do:

  • Restructure the test directory
  • delete old code that I don't need
  • disable tests that I don't care about (old grammars and such)
  • turn PhpGrammarNew into PhpGrammar & either delete the old one or rename the old one to PhpGrammarOld
  • structure the code/PhpNew directory ... the code/Lexer dir ... maybe just review all the files & try to structure things better ... code/ should have code/Grammars/...
  • genericize the running & testing of lexing full php files ... the idea is ... i WILL run into bugs while I'm working on other software. WHEN i run into a bug, i could just copy+paste the entire source code of the file into this genericized test suite & make it easy to view the ast output & make it easy to change what's expected
  • clean up PhpGrammarNew ?? I think delete the old code I don't need ... maybe write other tests? idk
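A rough sketch of the genericized file test idea from the list above. Everything here is an assumption: run_file_tests() is a hypothetical function, the "<name>.expected.json" convention is invented, and $lex stands in for whatever callable turns source code into an ast array. The point is only that dropping a file into a directory is enough to get a recorded, editable expectation.

```php
<?php
/**
 * Hypothetical generic runner: lex every *.php file in $dir and compare the
 * result against a JSON file stored next to it.
 *
 * @param callable $lex function(string $source): array (stand-in for the lexer)
 * @return array file name => 'recorded' | 'pass' | 'fail'
 */
function run_file_tests(string $dir, callable $lex): array
{
    $results = [];
    foreach (glob($dir . '/*.php') ?: [] as $sourceFile) {
        $expectedFile = substr($sourceFile, 0, -4) . '.expected.json';
        $actual = $lex(file_get_contents($sourceFile));

        if (!is_file($expectedFile)) {
            // First run: record the current output so it can be reviewed & edited by hand.
            file_put_contents($expectedFile, json_encode($actual, JSON_PRETTY_PRINT));
            $results[basename($sourceFile)] = 'recorded';
            continue;
        }

        $expected = json_decode(file_get_contents($expectedFile), true);
        // Loose array comparison: same key/value pairs, key order ignored.
        $results[basename($sourceFile)] = ($expected == $actual) ? 'pass' : 'fail';
    }
    return $results;
}

// Usage (paths and the lexer callable are placeholders):
// print_r(run_file_tests(__DIR__ . '/input', fn(string $src) => my_lexer($src)));
```

Changing what is expected then means editing the JSON file, or deleting it and re-running to record the new output.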

PhpGrammarNew status

Current

  • Working on phptest -test PhpGrammarNew_SampleClass. Manually comparing test/php-new/SampleClass.tree2.php to test/php-new/SampleClass.php.
    • The class is not getting a docblock. Maybe I need to write a little test for this.

Next Major steps

  • Test that docblocks work on: (I just kind of assume they do...)
    • consts
    • properties
    • methods
    • classes
    • traits
    • DONE use TraitName;
    • namespace NS;

TODO (low priority)

  • Consider always setting namespace & (or at least) fqn on class...

Latest

  • Support return by reference on methods: public function &global_warming() (this method launches billionaires to space temporarily & returns a reference to an array detailing the CO2 pollution. (which is bad, because you can just lie and change it to less pollution, now that we're done with the space trip))
  • Support all arithmetic operators
  • added none operation target, which just stops the directive from being processed if it is that operation
    • Used by the /* operation so that the docblock directive still handles it
  • Refactored operation stuff & multi-char operations are now functional.
  • wrote operation_match() method to check if the current buffer is an operation. Simplifies matching multi-char operations.
  • Operators are refactored, so multi-char operators can be used
  • support simple array declaration
  • add Values trait to test value assignment
  • parse traits the same way I do classes (for the most part)
  • implemented: namespace Abc; class Def {} - get fully qualified name for the class
  • support & test php class use TraitName;
  • support & test php class consts
  • add a stop instruction set to php_code directive
  • catch <?php and ?>
  • Unit test static properties and methods
  • add minimal debugging output to the wd_ and op_ routing.
  • clean up notes
  • Property & method tests passing (& all other current tests)
  • Fix the "methods are inside class modifiers" bug
  • Add ability to -run Namespace.* and run any tests that wildcard match, basically.
  • Separate php directive tests into traits
  • Add word routing to wd_theword($lexer, $xpn, $ast)
  • Rename OtherDirectives to CoreDirectives
  • Refactored operations into Handlers, Operations, and Words
  • Methods are completely handled! (I think)
  • Implementation almost complete for properties
    • Properties handle string concatenation, but NOT numeric operations
  • add operation routing to op_opname($lexer, $xpn)
  • New mechanism of operations & words
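To make the operation matching and op_ routing items above a bit more concrete, here is a minimal sketch. It is not the actual implementation: the real methods take $lexer and $xpn, while this version uses plain arrays, and the operation table, the ExampleOperations class, and route_operation() are all made up. It shows longest-match-first matching for multi-char operations, routing to an op_<name> method, and a 'none' target that does nothing.

```php
<?php
/** Return the longest operation symbol the buffer ends with, or null if none. */
function operation_match(string $buffer, array $operations): ?string
{
    $symbols = array_keys($operations);
    // Longest symbols first, so '.=' wins over '='.
    usort($symbols, fn(string $a, string $b) => strlen($b) <=> strlen($a));
    foreach ($symbols as $symbol) {
        if (substr($buffer, -strlen($symbol)) === $symbol) {
            return $symbol;
        }
    }
    return null;
}

/** Hypothetical handler class; each op_ method appends to a log for demonstration. */
class ExampleOperations
{
    public function op_concat_assign(array $log): array
    {
        $log[] = 'concat_assign';
        return $log;
    }
}

/** Route an operation target to op_<target>, skipping 'none' and unknown targets. */
function route_operation(object $handlers, ?string $target, array $log): array
{
    if ($target === null || $target === 'none') {
        return $log; // 'none': leave the buffer for another directive (e.g. docblock)
    }
    $method = 'op_' . $target;
    return method_exists($handlers, $method) ? $handlers->$method($log) : $log;
}

// symbol => operation target (assumed names)
$operations = ['=' => 'assign', '.' => 'concat', '.=' => 'concat_assign', '/*' => 'none'];

$symbol = operation_match('$a .=', $operations);
$log    = route_operation(new ExampleOperations(), $operations[$symbol] ?? null, []);
print_r($log);
```

Sorting by length on every call is wasteful; a real version would presumably sort the table once when the grammar is loaded.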

Lexer Status

Next

  • Finish the GrammarTesting documentation (once method bodies are available)

High Priority (internals)

  • Make DocblockGrammar work for bash (docblock starts with ##)
  • BashGrammar + simple tests. Catch:
    • docblocks
    • functions
    • maybe comments
    • nothing else
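For reference, here is what catching bash docblocks and functions might look like as a plain PHP function, outside the directive system. This is only a sketch under my own assumptions: a docblock starts on a line beginning with ## and continues through the following # lines, and the first non-comment line after it may declare a function as "name() {" or "function name". bash_docblocks() is a hypothetical name.

```php
<?php
/**
 * Collect '##' docblocks from bash source, with the function each one precedes.
 *
 * @return array list of ['docblock' => string, 'function' => ?string]
 */
function bash_docblocks(string $source): array
{
    $blocks  = [];
    $current = null; // lines of the docblock being collected, or null when outside one

    foreach (preg_split('/\R/', $source) as $line) {
        $trimmed = ltrim($line);

        if ($current === null) {
            if (strncmp($trimmed, '##', 2) === 0) {
                $current = [trim(substr($trimmed, 2))];
            }
            continue;
        }

        if ($trimmed !== '' && $trimmed[0] === '#') {
            // Still inside the docblock: strip the leading '#' characters.
            $current[] = trim(ltrim($trimmed, '#'));
            continue;
        }

        // First non-comment line ends the docblock; check for a function declaration.
        $name = null;
        if (preg_match('/^(?:function\s+([A-Za-z_]\w*)|([A-Za-z_]\w*)\s*\(\s*\))/', $trimmed, $m)) {
            $name = $m[1] !== '' ? $m[1] : $m[2];
        }
        $blocks[] = ['docblock' => implode("\n", $current), 'function' => $name];
        $current  = null;
    }

    if ($current !== null) {
        // Docblock ran to the end of the file with nothing after it.
        $blocks[] = ['docblock' => implode("\n", $current), 'function' => null];
    }
    return $blocks;
}

$script = implode("\n", [
    '#!/usr/bin/env bash',
    '## Say hello',
    '# to someone',
    'greet() {',
    '  echo "hi $1"',
    '}',
]);
print_r(bash_docblocks($script));
```

Plain # comments and the shebang line are ignored here because they do not start with ##.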

Low Priority

  • rename previous to meta or data. I'm thinking meta. & make it its own group if it's not
  • Add meta information to ASTs & OPTIONALLY include meta information when getting an AST tree
    • Convert type to _type on ASTs maybe? (it's meta information)
  • add additional command arg expansion features ([] and !)
  • abort feature:
    • Add an instruction that: Mark a directive as an abortion target
    • Add an instruction that: pop directives until a named abortion target is on the top of the stack
  • Threaded parsing. Starts as a single thread, then any instruction can create a new thread & now we'll continue processing on the main thread & process on the new thread as well. Ultimately only one thread can "win", so all but the "correct" thread will be discarded
    • move the while loop in Internals into its own function like do_the_actual_lexing($lexer, $token) or something.
    • Might just have to create multiple lexer instances to do this.
  • Performance: Use arrays, & do object conversions on directives at the last possible moment. I only want them as objects, so they're easier to work with by reference.
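The abort feature from the list above could be sketched like this. None of it exists yet: DirectiveStack, push(), abortTo(), and top() are hypothetical names, and directives are reduced to plain names. One method marks a directive as an abort target when it is pushed, and the other pops directives until the named target is on top of the stack.

```php
<?php
// Hypothetical sketch of the two abort instructions (all names are assumptions).
final class DirectiveStack
{
    /** @var array list of ['name' => string, 'abort_target' => bool] */
    private array $stack = [];

    /** Push a directive, optionally marking it as an abort target. */
    public function push(string $name, bool $abortTarget = false): void
    {
        $this->stack[] = ['name' => $name, 'abort_target' => $abortTarget];
    }

    /**
     * Pop directives until the named abort target is on top of the stack.
     * Returns the popped names (topmost first); returns [] and changes
     * nothing if no such target is on the stack.
     */
    public function abortTo(string $target): array
    {
        // Search from the top down for the nearest matching abort target.
        for ($i = count($this->stack) - 1; $i >= 0; $i--) {
            $entry = $this->stack[$i];
            if ($entry['abort_target'] && $entry['name'] === $target) {
                $popped = array_splice($this->stack, $i + 1);
                return array_reverse(array_column($popped, 'name'));
            }
        }
        return [];
    }

    /** Name of the directive on top of the stack, or null if empty. */
    public function top(): ?string
    {
        $last = end($this->stack);
        return $last === false ? null : $last['name'];
    }
}

$stack = new DirectiveStack();
$stack->push('php_code');
$stack->push('class', true); // marked as an abort target
$stack->push('method');
$stack->push('string');

print_r($stack->abortTo('class')); // pops string, then method
echo $stack->top(), "\n";
```

Leaving the stack untouched when the target is missing is a design choice in this sketch; throwing might be better for catching grammar mistakes.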

Latest

  • Handling nowdoc & heredoc <<<STR && <<<'STR' (somewhat lazily)
  • Rewrote strings directives (phpgrammar)
  • Fix bug (docblockgrammar): Empty docblocks led to infinite loop
  • Fixed Bug (phpgrammar): Nested blocks inside methods were not being caught, so methods were ending when nested blocks closed
  • Fixed Bug (phpgrammar): <?php class Abc extends Alphabet{} Fails to get Alphabet. Write a test for & fix it.
  • Move cmdMethods and the method map into their own trait
  • add :+directive_name as alternate to :_blank-directive_name
  • fix stop instruction, so it now ONLY stops directives if they're started in the top stack
  • add inherit/directive.inherit instruction
  • add then.pop instruction
  • Write Documentation tests & an Examples doc file & cleanup README
  • add namespace.docblock to file ast & update tests
  • Added modifier to const ast
  • Docblock grammar integrated into phpgrammar
  • Docblock Grammar passing
  • Fixed: Some lines would show * after docblock grammar was done parsing.
  • then instruction now allows you to specify grammar like then docblock:/*
  • Clean up DocBlockGrammar stuff
  • Wrote a docblock grammar in almost pure php
  • Add Tester class to simplify testing of many directives
  • Minor improvements to intuitiveness of lex api
  • remove inspect-loop param from lexer
  • Refactor Command Processing
  • Refactor instruction execution
    • Move the switch into its own function & provide more clarity about passed in args
    • Improve naming scheme by namespacing all commands. There are also namespace free defaults
    • Create separate methods to replace complex cases (for readability & maintainability)
  • Create command parsing function (executeMethodString()) that handles things like _token:buffer or _lexer:unsetPrevious docblock
  • Disable bash grammar tests
  • Reorganize lexer's internals into traits & remove old functions
  • cleaned up PhpGrammar test.
  • Refactored names & namespaces of php grammar & directives
  • Deleted old, useless php grammars
  • Sorted PhpGrammar directives into traits
  • removed unneeded functions for phpgrammarnew & separated directives into a trait
  • Got PhpGrammarNew working for SampleClass.php
  • Got v0.6 working
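For the nowdoc & heredoc item at the top of this list, here is a small standalone sketch of telling the two openers apart. It is not how the grammar's directives do it; heredoc_opener() is a hypothetical helper that only looks at the opening <<< line and ignores the closing label entirely.

```php
<?php
/**
 * Detect a heredoc/nowdoc opener in a line of PHP source.
 *
 * @return array|null ['label' => string, 'type' => 'heredoc'|'nowdoc'], or null if none
 */
function heredoc_opener(string $line): ?array
{
    // <<<LABEL and <<<"LABEL" are heredocs; <<<'LABEL' is a nowdoc.
    if (!preg_match('/<<<[ \t]*([\'"]?)([A-Za-z_][A-Za-z0-9_]*)\1/', $line, $m)) {
        return null;
    }
    return [
        'label' => $m[2],
        'type'  => $m[1] === "'" ? 'nowdoc' : 'heredoc',
    ];
}

print_r(heredoc_opener('$a = <<<STR'));
print_r(heredoc_opener("\$b = <<<'STR'"));
var_dump(heredoc_opener('$c = 1 << 3'));
```

The backreference \1 makes sure an opening quote is closed by the same kind of quote, so a mismatched opener is not treated as a heredoc or nowdoc.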