Uncle Bernie's Rendering of Microsoft 6502 BASIC

Joined: Apr 1 2020 - 16:46
Posts: 1272
Attachment: 7ZIP file containing the Linux tarball (295.39 KB)

As announced in this thread:

 

https://www.applefritter.com/content/ms-basic-1977-freed

 

I've been working on a rendered version of Microsoft's 6502 BASIC source code for quite a while (maybe 2 hours each day for more than 4 weeks).

 

This rendered version is meant to work smoothly with modern macro assemblers which don't understand the DEC MACRO-10 assembler syntax in which the original source files were published by Microsoft.

 

Attached is a beta test version of the current state of my work and you are invited to try it out.

 

The 7zip file needs a trivial password to open and out comes a tarball for Linux.

 

If you are interested in testing this work and possibly porting it to a 6502 macro assembler of your choice, send me a PM (the send PM button) to get the password - sorry for the inconvenience with that, but I have good reasons for this:

 

First, I want to know how many people have downloaded it, to assess the overall interest in this work. Second, I want to avoid theft by so-called "AI", which would then pretend it did the rendering work all on its own, and maybe even sell it for money / "access fees". See my P.S. below for my take on theft by "AI". I have no solution yet for putting this work up on GitHub without it being stolen and misappropriated by "AI", which is known to scan all such public repositories on a regular basis. But I want to give this work to the world for free, no strings attached, so that the retro computing crowd can benefit from it. I just do not want "AI" to steal my work and claim that it (the "AI") made it.

 

Modern times.

 

In any case, if you are a human, you are invited to get, test, read, and expand my work. You can even sell it, if you want.

 

What I expect from the beta testers is to find out (and report back to me) whether they are able to unpack the tarball and then get the contents going on their machine. It's not as trivial as it sounds, because the DASM assembler must be upgraded (at least two lines of "C" code added to it) to make it able to produce the binaries for Microsoft 6502 BASIC. I'd also like to know if you are willing and able to port it to other macro assemblers, such as MERLIN32, and still get matching binaries. I was able to achieve this match using DASM, but the source as such, even after my cleanup and rendering for modern macro assemblers, is a bit ambiguous and possibly problematic in a few places. I'd also like to know if anyone was able to use the provided material under operating systems other than Linux, and what would need to be done / added to the tarball to make it useful for that other operating system. Note that DASM can be compiled under many other operating systems (at least the DASM team claims this) and I provided a hexdump tool as source code. The only remaining possible issue is how "other" operating systems could compare the hexdumps and automate the process as I did.
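
On the question of comparing hexdumps on other operating systems, here is a minimal sketch in Python, which runs unchanged on Linux, Windows and macOS. It is not part of the tarball and does not reproduce its hexdump format: the `OFFSET: hh hh ...` layout, the function names and the mismatch message are assumptions made for illustration. Only the `*** MATCH ***` text is taken from the test output reported later in this thread.

```python
def hexdump(data, width=16):
    """Return one text line per `width` bytes, as 'OFFSET: hh hh ...'.

    The line layout is an assumption, not the tarball's reference format.
    """
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append(f"{offset:04X}: " + " ".join(f"{b:02X}" for b in chunk))
    return lines


def first_mismatch(built, reference):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(built, reference)):
        if a != b:
            return offset
    # All shared bytes agree; a length difference is still a mismatch.
    if len(built) != len(reference):
        return min(len(built), len(reference))
    return None


def report(built, reference):
    """Summarize the comparison in one line (mismatch wording is hypothetical)."""
    offset = first_mismatch(built, reference)
    if offset is None:
        return "*** MATCH ***"
    return f"*** MISMATCH at ${offset:04X} ***"
```

To compare two files, one could read each with `pathlib.Path(...).read_bytes()` and pass the two byte strings to `report`. Comparing the bytes directly sidesteps any difference in how each platform formats its hexdump text.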

 

Comments invited !

 

- Uncle Bernie

 

P.S.: IMHO, the whole "AI" scam is based on stealing intellectual property and works of art on which the "AI" is being "trained" to enable it to plagiarize the original work and make its "own" versions which obfuscate the original source(s) and authors / artists. This is not only unethical, but a crime, except that the lawmakers have not yet caught up with the technological progress making this theft possible. Right now, numerous lawsuits against "AI" plagiarism are in the courts (internationally) and the first ones (involving theft of song lyrics) have been won by the plaintiffs, and the "AI" companies lost the lawsuit(s). And rightfully so. That "AI" scam, if not stopped dead in its tracks by such legal actions and by updating laws protecting Intellectual Property and Works of Art, will have a catastrophic effect on creators, authors, inventors, and artists. The first completely fraudulent and artificial, "AI"-created "bands" have appeared on YouTube and found millions of views. For "AI" created low quality slop. YouTube is also flooded with "AI" created slop which regurgitates the contents of honest content creators using "AI" obfuscated background pictures and an "AI" created narration based on the original content made by real humans. It is really spooky - I've analyzed original content against the "AI" derived plagiarism and the narration of the "AI" voice is almost the same text as in the original, spoken by humans. So "AI" can translate human speech to text, obfuscate it by changing a few words, and then "speak" again with a human sounding "AI" voice. The fake "AI" generated "bands" even have "singers" and "AI" generated human-looking "musicians" pretending to "play" their instruments. A bit more "AI" progress and all actors in the movie industry (but not for stage performances) will be out of a job. I could go on forever but you can see the threat now, I hope.
And note this: a recent MIT study has shown that 95% of the companies that tested "AI" to replace humans were disappointed by the poor and faulty results. In other words, despite $30-$40 billion invested, 95% of AI adopters found that they get no return from deploying current AI. Here is the link:

 

https://mlq.ai/media/quarterly_decks/v0.1_State_of_AI_in_Business_2025_Report.pdf

 

Hope you see the big danger. "AI" is able to wipe out human creators, authors, inventors, and artists by quick and cheap mass production of low quality plagiarisms and slop which only appeals to the feeble minded and those who can't discern quality from cr@p. I think that this is an immediate threat to our civilization and a road to ruin. Watch the Y2006 movie "Idiocracy" ... it was meant as a comedy but, alas, it's not. "AI", if not stopped, will get us there sooner than you think. And "Meritocracy" will be gone, forever.

mmphosis's picture
Joined: Aug 18 2005 - 16:26
Posts: 463
redacted
redacted
Joined: May 16 2021 - 08:07
Posts: 49
compiles

The upgraded assembler (with UB mods) compiles for me on Windows.  Details:

 

I tried the "big 3" compilers, namely GCC, CLANG, and MSVC.  Only GCC works, because DASM includes the plural `strings.h` (as opposed to the singular `string.h`).  I'm not sure how hard it would be to "fix" this.  Based on a quick search, the plural one is a POSIX header that declares some things which are not in the C standard.

 

I used a command line workflow from powershell.  Dependencies were:

* CMake (installed using winget)

* Ninja (installed using winget)

* GCC (installed using scoop)

 

The things that I tried but did not work were:

* LLVM (installed using winget)

* MSVC (installed along with Visual Studio 2026)

 

I also was able to build it using Meson as an alternative to CMake.

 

Joined: May 16 2021 - 08:07
Posts: 49
more dasm compiling

Digging further: when DASM includes `strings.h`, they mean it. You can't simply replace it with `string.h`.

Furthermore, they include `unistd.h`, another platform-specific header.

So getting MSVC to compile it would probably not be fun. I think I will just proceed with GCC.

Khaibitgfx's picture
Joined: Jun 29 2019 - 20:02
Posts: 296
Loss of Jobs...

An interesting opinion...

 

https://www.theguardian.com/us-news/2025/dec/28/bernie-sanders-artificial-intelligence-ai-datacenters

Joined: May 16 2021 - 08:07
Posts: 49
all works

Success on Windows with all 4 targets.  My workflow was as follows:

 

* Generate `ub_dasm.exe` with scoop/GCC

* Rewrite `UBtest` as powershell script `UBtest.ps1`

* Rewrite `testrun` as powershell script `testrun.ps1`

* Rewrite `make test` as powershell script `testall.ps1`

 

The only part of this that really took any time was `testrun.ps1`, because it generates the hex strings for comparison, and the output of the native `Format-Hex` cmdlet has to be manipulated to allow direct comparison with the reference data.  I also spent time reading the DASM manual, but that really is optional.

 

In this workflow the C program `hexdump.c` was not needed as it is taken care of by `testrun.ps1`.

 

I may try it all again on a Mac in a few days; I anticipate using Homebrew GCC to produce `ub_dasm`.

Joined: May 16 2021 - 08:07
Posts: 49
Merlin32

Conversion to Merlin 32 is going quickly, so far the most troublesome thing I see (other than busy work) is that Merlin might forbid some forward referencing that has been used

Joined: Apr 1 2020 - 16:46
Posts: 1272
Nice progress !

In post #7, 'dfxgordon' wrote:

 

" Conversion to Merlin 32 is going quickly, so far the most troublesome thing I see (other than busy work) is that Merlin might forbid some forward referencing that has been used "

 

Uncle Bernie comments:

 

This is good news ! Because Merlin 32 seems to be "the standard" assembler for the Apple II crowd.

 

The forward referencing issue is a bad surprise to me, as any professional grade macro assembler should be able to handle that.

 

I can only guess where this happens, as I did not succeed in compiling Merlin 32 from the sources, and I refuse to use tools which only come as binaries. I got burned too often: not by "viruses", but by tools that stopped working after Linux updates / upgrades / a move to another machine, and without the ability to compile from source the tool is lost. I have some 20+ year old Linux machines, too, and the "important" tools must run on all of them.

 

So my guess is that it only happens with labels that are set to another label which is still unknown at that point, like this pattern:

 

LABEL1 = LABEL2

 

where LABEL2 is somewhere downstream in the source code. You can try to fix this by making LABEL1 a transitory label (which can be reassigned), assign a dummy value to it of the right size (zero page or non zero page) in PASS1 and then assign the LABEL2 to it in PASS2. The DEC Macro-10 assembler had the conditionals IF1 and IF2 to discern between the two passes. I don't know how this could be done in Merlin 32, though, as I don't have it.
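
The PASS1 / PASS2 idea can be illustrated with a toy two-pass assembler, sketched here in Python rather than in any real assembler's syntax. Everything in it is an assumption made for illustration: the tuple-based "source", the `assemble` function, and the rule that an `lda` takes 2 bytes for a zero page operand and 3 bytes otherwise. It shows why the dummy value used in pass 1 must have the right size.

```python
def assemble(source, dummy=0xFFFF, origin=0x0800):
    """Toy two-pass assembler (hypothetical, for illustration only).

    Statements are tuples:
      ("equ", name, target)  ->  name = target   (target may be defined later)
      ("label", name)        ->  name = current program counter
      ("lda", name)          ->  2 bytes if the operand is in zero page, else 3
    Returns (symbols, sizes) as they stand at the end of pass 2.
    """
    symbols = {}
    sizes = []
    for _pass in (1, 2):
        pc = origin
        sizes = []
        for stmt in source:
            if stmt[0] == "equ":
                _, name, target = stmt
                # Pass 1: an unknown target gets the dummy value.
                # Pass 2: the target is known from pass 1, so use it.
                symbols[name] = symbols.get(target, dummy)
            elif stmt[0] == "label":
                symbols[stmt[1]] = pc
            elif stmt[0] == "lda":
                value = symbols.get(stmt[1], dummy)
                size = 2 if value < 0x100 else 3
                sizes.append(size)
                pc += size
    return symbols, sizes


# LABEL1 = LABEL2, where LABEL2 is only defined after the code that uses LABEL1.
SOURCE = [("equ", "LABEL1", "LABEL2"), ("lda", "LABEL1"), ("label", "LABEL2")]

good, _ = assemble(SOURCE)           # dummy has the right (non zero page) size
bad, _ = assemble(SOURCE, dummy=0)   # dummy wrongly looks like zero page
```

With the correctly sized dummy, both labels end up at the same address. With a dummy of the wrong size, the `lda` grows by one byte in pass 2, LABEL2 moves, and LABEL1 is left holding the stale pass 1 address.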

 

Good luck with your migration effort, which is much appreciated !

 

(Oh, and some volunteer should also migrate this work to modern Apple computers using macOS, if there is a free / open source "C" compiler available for it. You see, I have no Mac ;-)

 

- Uncle Bernie

Joined: May 16 2021 - 08:07
Posts: 49
update

Thanks UncleBernie for the notes, and for your rendering which makes this far easier.

 

In defense of Merlin, it does handle forward referencing; there are just some limitations applied in specific situations by legacy Merlin and probably Merlin 32 also. At any rate, my interpretation of the errors is still developing.  The error I'm getting now is a detection of recursion in an equivalence substitution, and what I found is that I can suppress that error if I comment out the counter increment that occurs in the dextral character inversion macros.  I'm still working on why this happens and what the least disruptive workaround is.

 

One issue is that Merlin 32 only targets the 65816.  Why is this a problem?  Well, something like `LDA [expr]` will be interpreted by Merlin 32 as direct page indirect long addressing, while DASM would regard the square brackets as part of an expression.  I think the Merlin 32 developers are planning to provide a feature to target the 6502; legacy Merlin 16+ can already do it, so it might actually be better to assemble on a IIgs! (To be clear, the square brackets must go in any Merlin; it's just a question of whether the assembler can help you find them or not.)

 

I did notice the PASS conditionals and I am worried about them.  I notice it only comes in for REALIO=3?

 

 

 

Joined: May 16 2021 - 08:07
Posts: 49
note on fwd refs

A three-line test program shows where there can be an issue:

 

]PET = LBL

LBL LDA #0

    JMP ]PET

 

Merlin 32 fails to assemble that.  But if the variable (indicated by leading right-bracket) is replaced with a global like this

 

PET = LBL

LBL LDA #0

   JMP PET

 

then it works.  However, this cannot be done everywhere, because some of the equivalences are indeed variable.

 

Merlin 8 gives "illegal forward reference" in either of the above cases.

Joined: May 16 2021 - 08:07
Posts: 49
suspending for now

Well, it seems there is a bug in Merlin 32 that is behind much of the trouble, it doesn't handle nested conditionals in all cases, and there are plenty of those in this code.

 

I filed an issue (https://github.com/apple2accumulator/merlin32/issues/60).  Until that is fixed it may be necessary to hold off on this, or else to take seriously the idea of using Merlin 16+ on an emulated IIgs until such time as M32 is fixed.

 

 

Joined: Apr 1 2020 - 16:46
Posts: 1272
On the issue of assembler quirks and bugs:

In posts #9 and #11, 'dfxgordon' wrote:

 

" I did notice the PASS conditionals and I am worried about them.  I notice it only comes in for REALIO=3 ?"

 

" Well, it seems there is a bug in Merlin 32 that is behind much of the trouble, it doesn't handle nested conditionals in all cases, and there are plenty of those in this code."

 

Uncle Bernie comments:

 

I introduced the "PASS" variable to replace the IF1 and IF2 conditionals of DEC's MACRO-10 in Microsoft's original source code. PASS is set immediately before being used to the state it should be in to generate correct code, so no need to worry. If your assembler does not allow that, just put a constant 0 or 1 into the IF.

 

About the '[' ']' brackets being used in lieu of parentheses: this is a common and sad feature of most 6502 assemblers, and I hate it, as it is a very ugly crutch. With just a little bit of effort in the parser it would be possible to continue using regular parentheses for precedence in expressions, even when allowing the same parentheses for indicating indirection in 6502 operands.
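
As a rough illustration of that claim, here is a minimal sketch in Python of one way a parser could tell the two uses apart: find the parenthesis that matches the first one and look at what follows it. The function name and the returned mode names are hypothetical, and this is not a description of how DASM, Merlin or any other particular assembler works.

```python
def classify_operand(operand):
    """Decide whether leading parentheses mean indirection or just grouping.

    Hypothetical helper: returns "indexed-indirect" for (zp,X),
    "indirect-indexed" for (zp),Y, "indirect" for (addr), and "direct"
    when the parentheses only group part of an expression.
    """
    text = operand.replace(" ", "").upper()
    if not text.startswith("("):
        return "direct"

    # Locate the parenthesis that closes the very first one.
    depth = 0
    for index, char in enumerate(text):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth == 0:
                close = index
                break
    else:
        raise ValueError("unbalanced parentheses in operand")

    inner = text[1:close]
    rest = text[close + 1:]
    if rest == "" and inner.endswith(",X"):
        return "indexed-indirect"
    if rest == ",Y":
        return "indirect-indexed"
    if rest == "":
        return "indirect"
    # Something else follows the group, e.g. (A+1)*2 or (A+1),X:
    # the parentheses were only used for precedence.
    return "direct"
```

One ambiguity remains with this rule: an operand that is wrapped entirely in parentheses is always read as indirect, so a programmer who only wants grouping there would have to write something like `0+(A+1)`.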

 

IMHO the root of this and other problems with these assemblers is that most of the "open source" / "free" 6502 assemblers have initially been written back in the 1980s and 1990s by "poor" college students whose programming skills were not yet fully developed ... compiler construction classes come later in the curriculum and unless you use LEX and YACC to build scanners and parsers, all sorts of unpredictable behaviour of the assembler can and will happen.

 

As far as I am concerned, I looked into the source code of all the 6502 assemblers I use and was horrified. None met my professional standards, and I saw terrible coding here and there. But what the heck, it's "free" and can be downloaded for nothing. Most do compile out of the box (some don't, depending on the Linux version you use) and most work well enough to be useful, but all of them have shortcomings, quirks and bugs which must be known and dodged.

 

THE BEST 6502 ASSEMBLER (IMHO)

 

The best 6502 assembler I ever used was MAC/65 by Optimized Systems Software, a bank-selected cartridge for the Atari 8-bit, and it was / is a professional grade macro assembler that never let me down. It was VERY expensive for me at the time I bought it, but I used it to build products which were successful and are the foundation of my wealth. I could have paid 100 x the price for MAC/65 and would still be very happy about the "fruits" it helped me to grow and harvest ! But I think this illustrates the point: unless it's productivity software that makes them lots of money, most people do not want to buy expensive software tools. They must be "free". I'm in that group, too. But if I needed a $100k CAD suite (Cadence ?) to design an IC which would net me $1 Million, I'd gladly buy this CAD suite.

 

ATASM - "ALMOST" MAC/65 compatible

 

The ATASM assembler I used later (on Linux), which is very compatible with MAC/65, works fine even though it seems to be another college student project from the 1980s/1990s and is not maintained anymore; the last time I looked it up on GitHub it had not been touched for more than 9 years. But it does have the fatal bug (for working with the Microsoft BASIC source) that it crashes with a "segmentation fault" whenever live code is assembled into the zero page.

 

THE MOVE TO DASM

 

This is why I moved to DASM for this work. And except for the lack of the DCX directive (which can be added by adding two lines of code) it worked fine for me. It seems to be quite robust and it handled all the tricky things in the Microsoft 6502 BASIC source without a hitch. Even the .IF / .ELSE / .ENDIF being nested 6 levels deep (or more).

 

MERLIN 32 - A DISAPPOINTMENT

 

Thanks to you, 'dfxgordon', for being bold enough to try out Merlin 32, and wasting your time with it. I did not spend more than 15 minutes on it when it did not compile on my machine, which is always a bad sign, as I do keep one of my Linux machines current and upgraded. The older machines I keep as they are; they never get any updates, so I can still compile older "C" programs. But they never get connected to the internet, either.

 

ANY OTHER OPTIONS ?

 

An Applefritter member did mention somewhere that he would like to try his homegrown 6502 assembler with the rendered Microsoft 6502 BASIC source code, but I lost track who that was and where it was posted or messaged.

 

CONCLUSION

 

Seems for the moment we are stuck with DASM.  Alas, I don't have the time to write my own macro assembler. I once wrote my own C compiler (with some special enhancements / capabilities) and this stole one year of my life. And it never made me a dime, even though it worked like a charm. I don't wish to repeat this exercise by writing yet another 6502 macro assembler ("YA65MA" ?).

 

- Uncle Bernie

Joined: May 16 2021 - 08:07
Posts: 49
about parsing

UncleBernie, since you mentioned parsers, I want to take the opportunity to relay something I found interesting in developing parsers for Applesoft and Integer BASIC.

I chose Tree-sitter because syntax highlights were the initial motivation.  For Applesoft, it was necessary to go to some considerable lengths to force Tree-sitter to be dumb enough to parse Applesoft the same way the legacy MS code does it (which is necessary if you want the highlights to accurately show the historical way the code would be interpreted).  Specifically an external scanner was needed to force context to be ignored under certain conditions.

Integer BASIC, on the other hand, could be parsed in a rather straightforward way using Tree-sitter: no external scanner was needed, and there were only modest LR(1) conflicts that could be handled with built-in mechanisms.  So Integer BASIC seems to be mostly LR(1).  Did Woz instinctively build a canonical parser without realizing it?

Full disclosure: I am not formally trained in this myself. I try to make up for it with lots of testing.

 

Happy New Year!

Joined: May 16 2021 - 08:07
Posts: 49
Works on Mac

Arm64 Mac mini and macOS Tahoe 26.2; the whole process took about 30 minutes.

 

* The modified DASM assembler could be compiled without any trouble using default Xcode command line tools (7 tests fail but we are told overall result is PASS).

* The upgraded assembler test `UBtest` does not quite produce 0-diff:

  - reference line: INITIAL CODE SEGMENT         0000                            0000                

  - test line: INITIAL CODE SEGMENT                  0000 ????                       0000 ????

* It was necessary to replace `<malloc.h>` in `hexdump.c` with `<memory.h>`

* `make tests` gives `*** MATCH ***` for all 4 cases

 
