Sunday, June 29, 2014

Interfacing Wiz550io to GA144

(aka: An Internet music player pt4)

Spent a week writing some low-level Forth words (subroutines) to drive the SPI interface on the wiz550io. Then realised that most of them already exist in the ROMs included with nodes 705 and 008 on the GA144.

Most of my time was spent trying to understand the heaps of C code out there for Arduinos and other microcontrollers. Wiznet has released source code for its drivers and, because it has to cover all use cases, it is incredibly detailed and wordy. I quickly decided that for the music player I would only ever be reading and writing multiple bytes, so there is no need to handle the Fixed Data Modes. Likewise I will only be supplying a single response to any client request: a stream of HTTP packets containing MP3 data. (Although I will also add code to query the DHCP server and obtain an IP address.)

I'm slowly starting to appreciate how good ArrayForth is.

To put things in perspective, the C source code from Wiznet for use with Arduino comes to 2711 lines. Admittedly much of this is comments. The code below for the same basic functions comes to 78 lines of which about 50% is comments.

I had a couple of actual hardware issues. When I plugged in the Wiz550io, the 3.3v rail sank to 2.9v. I quickly realised that it isn't properly regulated. So I wired an el-cheapo buck regulator board into the 5v rail to supply 3.3v and it's rock solid now.

Another issue was that (I think) I was clocking the Wiz too fast. I copied an SPI routine in the ArrayForth example code for reading and writing flash RAM and it was written to maximize data transfer speeds. I changed the delay factor from 20 to 1000 and transfers seem very stable now.

Another issue was that I wasn't watching very carefully when I wrote the routine to write to the IO pins on the GA144 and wasn't giving it enough settling time. As a result I was getting occasional glitches on the chip select line. There are lots of warnings in the user guide and I fell straight into the trap. Simple solution is to add a couple of no-ops after the write.

The code below uses two GA144 nodes. Node 705 is there because it and node 008 are the only nodes with all the i/o pins needed for SPI interfacing. Node 706 is there because I didn't want to burden 705 with time-wasting code which varies the clock speed, and it makes it easier to separate the workload. Node 706 pre-shifts the write bytes before sending them to node 705. Meanwhile 705 can be getting on with the task of changing i/o pin levels and pausing to allow them to settle without holding up node 706.

Eventually I will need to rewrite code in 706 to read and write bytes from other nodes via 'wires' (same as used in musicbox).

Open Workbench Logic Sniffer

I used this US$50 16-channel logic analyser (and software) to follow the SPI bus exchanges and it has got to be the best value for money in the electronics world. Not only does it capture logic levels up to 50 MHz but the software also includes protocol analysers, including an SPI analyser.

The screenshot (click image to enlarge) shows node 705 sending a 4-byte IP address (192.168.1.228) to the wiz for loading into the device's Source IP Address registers (0x0f-0x12). It then reads the registers back and stores them in the data stack of node 706.
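For anyone checking an analyser trace against the datasheet, the frame in the screenshot can be reconstructed by hand. The Python sketch below is only an illustration that runs on a PC, not on the GA144. It assumes the W5500 frame layout of a 16-bit address, one control byte (block select, read/write bit, mode bits) and then the data, and the function name w5500_frame is made up for this example.

```python
def w5500_frame(address, block, write, data=b""):
    """Build a W5500 SPI frame in variable-length data mode.

    Layout: 16-bit address (high byte first), one control byte
    (block select in bits 7-3, read/write in bit 2, mode in bits 1-0),
    then the data bytes. Names here are illustrative only.
    """
    control = (block << 3) | (0x04 if write else 0x00)  # mode bits 00
    header = bytes([(address >> 8) & 0xFF, address & 0xFF, control])
    return header + bytes(data)


SIPR = 0x000F  # Source IP Address registers, common register block (block 0)

# Write 192.168.1.228 into the Source IP Address registers.
frame = w5500_frame(SIPR, block=0, write=True, data=[192, 168, 1, 228])
print(frame.hex(" "))  # address, control byte, then the four IP bytes
```

The control byte comes out as 4 and the address as 0x0f, which matches the "4 f cmd" line in the node 706 listing further down.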

Pinging the Wiz550io

As luck would have it, my home LAN uses the 192.168.1.xxx subnet and the default startup IP for the Wiz is 192.168.1.2, which is otherwise unused on my LAN. So I plugged the Wiz into its socket on the EVB001, turned on the 3.3v supply and plugged an Ethernet cable into it from the router. Then I pinged the IP address and the Wiz responded. I unplugged the Wiz and pinged again and nothing answered (so I knew there wasn't another device on the LAN with the same IP :). Similarly after running the code below to change the IP address of the Wiz I was again able to ping it at 192.168.1.228, confirming the SPI upload of the IP address.

Next tasks

So now I need to add the DHCP client and a simple HTTP 'Hello World' server before tackling the huge task of writing an MP3 encoder.

The code

1018 list 
spi interface to wiz550io
705 node 0 org
start 00 left a! io b!
2 20 1000 dup -++ half
nxt @ push ex . nxt ;
done 09 -++ !b . . ;
!8 8obits drop ;
!24 @ 8obits 0f !8 @ !8 ;
addsel
 11 dup select !24 ;
bytout
 13 @ for @ !8 next done ;
byte
 17 dup dup or 7 push ibits ;
bytin 1a @ for byte ! next done ; 1e
  
start init regs a and b. Load stack with delay factor. Set pins 1(CLK) and 3(CS) high and 5(MOSI) low
nxt read address of next word to run from left port, push into return stack and ex to jump to word.
done set cs and clk high.
!8 use rom word 8obits to clock out 8 bits in reg t.
!24 clock out 16 bits in t, drop t then clock 8 bits out of next stack word.
addsel enable cs pin 3 then clock out 24 bits to select address in wiz. Also set r/w bit.
bytout read num of bytes to output from left port, then clock each set of bits out.
byte clock 8 bits into reg t.
bytin read num of bytes to input from left port, input each byte then send it to left port.

1020 list 
drive spi 705 iface
706 node 0 org
lsh for 2* unext ;
cmd push 9 lsh pop 1 lsh
06 @p ! . ' addsel ' ! ! ;
wbyt 09 @p ! ' bytout '
   dup ! for 9 lsh ! next ;
rbyt 10 @p ! ' bytin '
   dup ! for @ unext ;
start 14 left a!
begin
   4 f cmd
   228 1 168 192 3 wbyt
   0 f cmd
   3 rbyt
   warm
end 28
lsh left shift
cmd left shift block select byte 10 bits, left shift address offset word 2 bits then pass s and t to node 705 to clock the 24 bits out to wiz.
wbyt write n bytes out using node 705 bytout word. shift each byte left by 10 bits prior to passing to bytout.
rbyt read n bytes into data stack.
start set a to left port, write IP address to wiz register then read IP address back in to leave it on stack.
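The reason for the pre-shifting is easier to see outside Forth. F18 words are 18 bits wide and the ROM output word tests the top bit and then shifts left, so a value has to be moved up to the top of the word before it is clocked out. The Python sketch below models that idea only. It is not a translation of the ROM code, and the helper names are invented for the illustration.

```python
WORD_BITS = 18
MASK = (1 << WORD_BITS) - 1


def lsh(value, count):
    """Shift left inside an 18-bit word, as repeated 2* would."""
    return (value << count) & MASK


def clock_out(word, nbits):
    """Bits sent by an MSB-first shifter: test bit 17, then shift left."""
    bits = []
    for _ in range(nbits):
        bits.append((word >> (WORD_BITS - 1)) & 1)
        word = lsh(word, 1)
    return bits


def bits_to_int(bits):
    """Reassemble a list of bits (first bit is most significant)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# A byte shifted left 10 places sits in bits 17..10 of the word.
sent_byte = bits_to_int(clock_out(lsh(0xC0, 10), 8))
# A 16-bit address shifted left 2 places sits in bits 17..2.
sent_addr = bits_to_int(clock_out(lsh(0x000F, 2), 16))
print(hex(sent_byte), hex(sent_addr))
```

Doing the shifts in node 706 means node 705 only ever has to clock out whatever arrives in the top bits of the word.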

1428 list ROM code
spi boot top/bot
4 kind aa reset host
---
 2a lit ; do, ce-, clk
--+
 2b lit ;
+--
 3a lit ;
+-+
 3b lit ;
-++
 2f lit ; target
a1 org 1388 load relay
c2 org
8obits
 dw-dw' 7 for leap obit
                2* *next ;
ibit
 c7 dw-dw'
   @b . -if drop - 2* ;
   then drop 2* - ;
half
 ca dwc-dw !b over
   for . . unext ;
select
 cc dw-dw -++ half
                --+ half ;
obit
 d0 dw-dw then
   -if +-- half
       +-+ half ; then
rbit
 d5 dw-dw --- half
              --+ half ;
18ibits
 d9 d-dw dup 17 for push
ibits
 begin rbit ibit - next ;
u2/ 2/ 1ffff and ; e1
a9 org
a9  warm await ;
aa 1430 load the rest
c1


1014 list 
install via async bootstream
empty compile
streamer load framer load
async frame ae fram ;
wizspi align create
708 705 to
-1 ,

wizspi course
2 fh load frame stream

serial load -canon
a-com sport ! a-bps bps ! !nam
talk send
2 706 hook panel


1016 list 
705 +node 705 /ram 0 /p
706 +node 706 /ram 14 /p
2 0 705 hook                                            

Saturday, June 28, 2014

An Internet music player pt3

The three subprojects mentioned in previous posting can in fact be regarded as one project, viz., connect the Wiz550io to the GA144 and output the stream of PCM codes as a webserver response. Conversion to streamed MP3 is icing on the cake. Booting from Flash is additional icing. Designing and building a PCB to hold the circuit will be the icing, cream and two cherries on top.

I've added two headers to the EVB001 prototyping area and wired them to the SPI connections of node 705. Thankfully realised that Wiz550io uses 3.3v levels but node 705 pins are at 1.8v levels. So I wired one of the spare level converters into the path.

Running Wiz550io from 3.3v required me to add a 3.3v source. Using my Dangerous Prototypes ATX Breakout Board to supply the 3.3v, I also realised I could add a jumper to the 5v supply connection and remove the 5v wall wart I had been using.

So now, using the examples in Ch 9 of ArrayForth Users' Guide, I can switch on the SPI pins DO, CLK and CS and see 3.3v levels on the corresponding pins of the socket for the Wiz (MOSI, SCLK and SCSn). On the EVB001 these pins are hidden behind some selection logic but there are some test points on the board to verify the correct responses.

Next step is to write some code to run the Wiz550io from node 705. Need word to get an IP address from DHCP server on network (overriding default IP). Need word to echo text back to client.

Sunday, June 15, 2014

Make a PCB

As I noted in a previous blog I followed a YouTube tutorial on how to create a PCB with KiCAD. Having got that far I decided to order the boards from DirtyPCBs. AU$14 for 10 boards including postage is hard to beat.

The boards arrived last Thursday and they are beautiful! And red! (I had expected green.)

So I figured that having come this far I might as well complete the exercise.

Being too lazy to shop around I decided to use DigiKey to fulfill the Bill of Materials (BOM). For 10 boards the total came to AU$52 (plus $16 for solder paste).

The whole point of the PCB design was to use SMD components so I've also ordered various tools to assist SMD soldering. On ebay I bought a pick and place tool, hot air gun and USB microscope.

I'm not sure whether a stencil is needed but I want to try this. For the size of the board in this instance and the size of the components I suspect squeezing the solder paste from the tube will be sufficient.

I've been contemplating a toaster oven converted to reflow oven but it doesn't make much sense at this stage.

An Internet music player pt 2

Although I only published pt 1 today, I actually wrote it a couple of weeks ago. In the meantime I have been attempting to move the music player app into the top right hand of the GA144 chip. Nothing I tried would make it work and when I took the app back to the original layout it wouldn't work either! Eventually I found my typing error. The good news is that I learned how to use the async loader so now can update apps in nodes almost instantaneously.

A couple of sub-projects have arisen. One is to reduce the bit depth of the audio stream generated by the app. At the moment it is generating 18-bit PCM samples but for CD-quality sound I only need 16-bit samples. Dropping the extra two bits will also save processing time.
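Going from 18 bits to 16 bits amounts to throwing away the two least significant bits of each sample. The Python sketch below shows the arithmetic, assuming the samples are 18-bit two's complement values (the post does not say so explicitly). On the F18 this would presumably be a pair of 2/ instructions. The function names are made up for the example.

```python
def to_signed18(raw):
    """Interpret an 18-bit word as a two's complement number."""
    raw &= 0x3FFFF
    return raw - 0x40000 if raw & 0x20000 else raw


def pcm18_to_pcm16(raw):
    """Drop the two least significant bits (arithmetic shift right by 2)."""
    return to_signed18(raw) >> 2


for word in (0x1FFFF, 0x20000, 0x3FFFF, 0x00004):
    print(hex(word), pcm18_to_pcm16(word))
```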

Another sub-project is to output the PCM stream to a file on my laptop to verify the format. It should be a WAV-format stream but there will be some mystic hand-waving (pun!) to modify the stream into WAV format.
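On the laptop side the hand-waving can be left to Python's standard wave module, which writes the RIFF header for you. The sketch below assumes mono, 16-bit signed samples at 44.1 kHz. Those settings and the function name are assumptions for the illustration, not something the app is known to produce yet.

```python
import io
import struct
import wave


def pcm16_to_wav(samples, sample_rate=44100):
    """Wrap a list of signed 16-bit samples in a mono WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # mono (assumed)
        wav.setsampwidth(2)        # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


wav_bytes = pcm16_to_wav([0, 1000, -1000, 32767])
print(len(wav_bytes), wav_bytes[:4], wav_bytes[8:12])
```

Writing the returned bytes to a file ending in .wav should give something an ordinary audio player can open.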

Another sub-project is to work out how to use the SPI Flash RAM to load the app code at startup and then get the Flash RAM out of the way so Wiz550io can use SPI bus.

Another sub-project is to write code to control Wiz550io from GA144 SPI node (node 705). As usual, C-code for Arduino seems incredibly verbose for what it has to do.


An Internet music player

Circuit Cellar is running a design challenge using the Wiz550io module. I applied mainly to see if I could get one of the modules to try out. Deadline is 3rd August. I really haven't left enough time but I am trying to enhance a plucked string synthesiser app I wrote for the GA144 to output streamed MP3. This stream is then fed to the Wiz550io and voila! I have an Internet music player streaming beautiful meditation music.

It seemed quite simple at first. Then I looked at how one encodes MP3 frames and I'm starting to baulk.

The idea is to read 1152 16-bit, 44.1 kHz PCM samples, split the samples into 32 frequency bands, run an FFT over the samples to work out which bands are dominant and use a psycho-acoustic model to select how much of each band to put into the output frame. Add in a Huffman encoder and we are left with a lot of code and a lot of RAM usage.

The question I can't answer yet is whether it is possible to fit it all into a GA144 alongside the synthesiser.

So far I've worked out that I need two sets of samples, 1152 for the current frame and 1152 for the previous frame. This allows better prediction of band energies. As it happens 1152 is 18 * 64 which fits precisely into one row of the GA144.
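The numbers are easy to check. The few lines of Python below work out how long one 1152-sample frame lasts at 44.1 kHz and confirm that the samples fill one row of 18 nodes at 64 words each. The constant names are just labels for this example.

```python
SAMPLES_PER_FRAME = 1152
SAMPLE_RATE_HZ = 44100
NODES_PER_ROW = 18
WORDS_PER_NODE = 64  # RAM words in each F18 node

frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ
frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME
row_capacity = NODES_PER_ROW * WORDS_PER_NODE

print(round(frame_ms, 2), "ms per frame")
print(round(frames_per_second, 2), "frames per second")
print(row_capacity, "words in one row")
```

Note that this only counts RAM words. Any code the nodes need to move the samples around has to share the same 64 words.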

So I have the synth taking 18 nodes (2 x 1/2 rows), and the samples taking 2 rows leaving 5 rows of nodes to encode the huge number of calculation constants and transforms. I need the SPI interface in node 705 to initially load the code from flash RAM and then to output the generated MP3 frames to the W5500. At this stage I don't need any additional RAM/ROM but I could throw the samples and the constants into external RAM if necessary.

I think the F18 nodes will be fast enough. This is only for audio, not video or radio.

I am attempting to port the LAME encoder, which is written in C, to ArrayForth.

Some subprojects I will need to implement on the GA144:

  1. Move synth nodes to top of chip (609-617, 709-717) and verify it works.
  2. Load 1152 samples out of synth into RAM of 18 nodes. Easy enough to verify in Sim. Model transport from MD5 hash encoder example.
  3. Translate all the required constants in LAME code into floating point equivalents and load them into nodes (how many?). Easy enough to use Perl for the calcs.
  4. Implement fft_long and fft_short and test. Will need to also tool up LAME code for single step debugging so can verify results
  5. Implement window type selection based on FFT analysis.
  6. Implement mdct*. This might need cosine calc/lookup as used in (co)sine synth from previous project.
  7. Implement Huffman encoder. Needs a table of lookup vals.
  8. Create output frame from encode data plus side info.
  9. Output frame to W5500 and verify Internet transmission. A stand-alone test could be to output the same frame which presumably would play the same note(s) repeatedly. Each frame represents about 26ms of sound (1152 samples at 44.1 kHz, or roughly 38 frames per second) so it would be very short.

Porting the MusicBox app to GA144

Introduction

The precursor to the GreenArrays GA144 was the SeaForth S40C18. The SeaForth had 40 nodes whereas the GA144 has 144 but they have the same instruction set. Chuck Moore parted ways with IntellaSys, the company which produced the SeaForth, a couple of years ago and the acrimonious lawsuit which resulted was settled last year.

One of the demo apps which was included in the VentureForth compiler kit was a musicbox app. It uses a synthesised plucked string algorithm to generate random but quite pleasant 'plucked string' music.

I've always liked the app so I decided for my own education to port it to the GA144 using ArrayForth.

The first thing I discovered was how great the divergence has been between VentureForth, the version of Forth used on the SeaForth chip, and ArrayForth, as used on the GA144. In addition ArrayForth is a closed universe. It's almost impossible to import any program files into aF. They have to be hand-typed, whereas vF is ANSI text based and I could use standard text editors such as Vim to manipulate the program code. However the need to understand each instruction meant hand-typing was a relatively small hurdle. The bigger task was to understand what the instruction meant in vF and replace it with the equivalent aF instruction. While the instruction set is one-to-one compatible, the compiler directives in vF are completely different and in some cases there is no equivalent. This caused me a few headaches.

The other handicap I faced is my lack of knowledge of Forth. In the end this wasn't a big problem because the code for MusicBox is designed for Forth chips like the SeaForth and GA144 and doesn't rely on a lot of standard Forth familiarity.

Overview of code

The code uses the best feature of the GA144, namely the ability to offload work onto adjacent nodes while continuing with another task. The "central" node, called the composer, decides which note to play next. It then relays this choice to one of six "plucked string" synthesis nodes which generate samples using the Karplus-Strong algorithm. The resulting sample streams are fed to a moving average filter node, which feeds the result to a 'pre-dac' node. That node converts the PCM samples to PWM values, which are in turn fed to a node which controls one of the digital-analog converters (dac). The output of the dac is fed to headphones or a speaker.

Eighteen nodes are used. The composer node uses "random" input from an ADC to select the next note to play from a list of harmonically related notes. There are no "discordant" notes.

The output of composer is sent to a router node which keeps track of which nodes are busy synthesising notes and channels the next note to the next free node.

There are six "pluck" nodes. The KS algorithm uses a one-sample delay and so there are six "delay" nodes, one for each "pluck" node. Similarly the MA filter node requires a one-sample delay node.

The output of the MA filter node is passed to a "pre-dac" node which calculates the three parameters needed to drive the dac node. The calculations are time-expensive and are thus off-loaded to a separate node rather than attempting to run them in the 'dac' node.

Thus 1 composer, 1 router, 6 pluck, 1 filter, 7 delay, 1 predac, 1 dac = 18 nodes. Composer node has to be a node with analog input and obviously the dac node must have analog output. This constrains where in the GA144 the nodes can be. Thus I chose the following path:

717 (composer), 617 (router), 616, 615, 614, 613, 612, 611 (pluck), 610 (filter), 609 (predac), 709 (dac). Delay nodes are: 710, 711, 712, 713, 714, 715 and 716.

Composer

bc (bit count, by Michael Montvelishsky) The note to be played is chosen by counting bits in the number on top of the stack. That bitcount is used to index the table of frequencies at the beginning of the code, and that frequency is passed on to the router node, to be given to one of the voice nodes.

note Use note number to lookup frequency then send it on to the router. 400 is a constant used to determine the length of a rest, and therefore the tempo. Decrease constant to play faster.

play The actual note to play is derived by counting the bits in the number on top of the stack. If the new note, determined by counting the bits in 'new', is the same as the old then play a rest i.e. be quiet. Otherwise give the new note number to 'note'. The note played is also left on the top of the stack to be compared with the next note.

piece Read a random number from the adc counter at 'data', "and" it with 511 to keep it reasonable (dac has only 512 levels). Store number in A register to be used as an increment to find the next note in the "piece".

compose Play 127 notes, beginning with 0 and incrementing the "note" number by the amount stored in the A register by piece. The actual note is determined by counting the bits in the number fed to play.
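The bit-counting trick is easier to follow in a conventional language. The Python sketch below mimics the description above: step a number by a fixed increment, count its set bits, and use the count as an index into the table from the listing. It is a model of the idea rather than a translation of the Forth. The function names, the use of None for a rest and the starting value of the "previous note" are all assumptions.

```python
# Table values copied from the listing (18 entries).
FREQ_TABLE = [27400, 27400, 24500, 21800, 19400, 18300, 16300, 14500,
              13700, 12200, 10900, 9700, 9100, 8100, 7200, 6800,
              6100, 5400]


def bitcount(n):
    """Number of set bits in n."""
    return bin(n).count("1")


def compose(increment, count=127):
    """Return the table value for each step, or None for a rest."""
    played = []
    number = 0
    previous = None  # assumption: nothing has been played yet
    for _ in range(count):
        note = bitcount(number)
        # Same note as last time means a rest, otherwise look it up.
        played.append(None if note == previous else FREQ_TABLE[note])
        previous = note
        number += increment
    return played


print(compose(1, count=4))
```

With an increment of at most 511 the number stays below 16 bits over 127 steps, so the bit count never runs off the end of the 18-entry table.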
 
1204 list 
musicbox - plucked string synthesis
713 node 0 org
lookup table of frequency data
27400 , 27400 , 24500 , 21800 ,
19400 , 18300 , 16300 , 14500 ,
13700 , 12200 , 10900 , 9700 ,
9100 , 8100 , 7200 , 6800 ,
6100 , 5400 ,
bc 12 bitcount dup dup or - for
      dup push zif drop pop - ;
      then pop and next
note
 17 a push a! . 18 @ !b pop a!
rest
 400 ;
delay
 1b dup for
         dup for unext dup or -
      next
      1f drop ;
play
 20 bc over over or if
         drop dup push note
         drop pop ;
      then rest drop drop ;
piece
 28 random adc
       data a! @ 1ff and a! ;
compose 2c a dup dup or 
notes 127 for
       2f over play 30 push
        a . + pop
     next drop drop ;
start 33 e000 !b down b! rest
      begin piece compose end 3a

Router

ring The address following ring is a variable that holds the next address to
be executed as a coroutine in the list that follows the variable. When
used in voice and force, the effect is to cycle through the
numbers in the "tables", returning the next number each time voice or
force is executed.

+note  sends the note-on message on to the mixer chain, along with a
voice number and force number, telling the chain which node should
process this note and how loud it should be.

The main loop of the router first checks io register to see if the composer is
requesting attention. If so then a note is received from the composer
and passed on to the appropriate voice node. In either case a "play"
message is sent on to the next node in the mixer chain, to keep the
note samples going.
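The effect of the two rings can be modelled with a pair of endlessly cycling lists. The Python sketch below uses the voice and force numbers from the listing that follows. It only shows the round-robin selection, not the message passing between nodes, and make_router is a name invented for the example.

```python
from itertools import cycle


def make_router():
    """Return a function that tags each note with the next voice and force."""
    voices = cycle(range(6))                    # voice numbers 0..5
    forces = cycle([120, 100, 80, 70, 60, 50])  # loudness values

    def route(note):
        return next(voices), next(forces), note

    return route


route = make_router()
for note in (27400, 24500, 21800):
    print(route(note))
```

After six notes both rings wrap around, so the seventh note goes back to voice 0 at force 120.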


1216 list 
router 613 node 0 org
ring pop b! @b push ex pop !b ; 
voice ring
5 ,
r1 0 ex 1 ex 2 ex 3 ex 4 ex 5
   ex r1 ;
force ring
14 ,
f1 120 ex 100 ex 80 ex
    70 ex 60  ex 50 ex f1 ;
+note 4 voice @p ! ! ' a relay '
   ! @p ! . ' @p a! @p . '
   ' w lit ! ! @p ! @p a!
   ! force ! ;
start 2c up a! io down
   begin
      over b! @b 2* 2* 2* 2* -if
         33 over a push a! @ pop
         a! +note
      then @p ! dup . ' @p play '
           or !
   end 3a

Pluck

Pluck uses the Karplus-Strong algorithm, http://en.wikipedia.org/wiki/Karplus-Strong_string_synthesis
A better explanation is at music.columbia.edu - Start with a buffer full of random numbers which is equivalent to an energetic string pluck, read through the buffer using the values as sample values, average each value with the previous value and write it back to the buffer as well as forwarding the sample to the dac player. Over time the averaging is equivalent to a low-pass filter and will remove the higher frequencies until eventually the waveform will be flat i.e. the string has stopped vibrating.
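For comparison, here is the same idea as a short Python sketch using floating point numbers and a single buffer. It follows the description above (random buffer, average each value with the previous one, write it back and output it) and is not a translation of the pluck node, which works on 18-bit integers and keeps its delay in a neighbouring node. The function name and parameters are made up for the example.

```python
import random


def pluck(period, n_samples, seed=0):
    """Generate n_samples of a Karplus-Strong plucked string."""
    rng = random.Random(seed)
    # A buffer of noise stands in for the energetic pluck.
    buf = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    prev = 0.0
    pos = 0
    for _ in range(n_samples):
        cur = buf[pos]
        avg = 0.5 * (cur + prev)  # two-point average acts as a low-pass filter
        buf[pos] = avg            # write back for the next pass
        out.append(avg)           # forward the sample to the output
        prev = cur
        pos = (pos + 1) % period
    return out


samples = pluck(period=50, n_samples=5000)
print(samples[:3])
```

The buffer length sets the pitch: a shorter buffer repeats more often and so gives a higher note.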

1214 list 
pluck - karplus-strong string synthesis
0 org
pluck dup push . + 2/ pop a -if
         drop 2/ 2/ ;
      then push zif
swp
 05 over push push drop pop pop ;
      then pop a! drop drop @p drop @p
!rnd
 dup !p ; 3ffff , rnd 0b -if
      2* 2cd81 or @p
   then 2* dup .. drop !rnd 1ff and dup ;
rwrw
 12 @p !b @b . ' !+ @ !b .. ' 1ff and ex
   @p !b !b . ' @p .. ' ex
   @p !b @b .. ' @ !b .. '
   2/ 2/ 2/ 2/ 2/ 2/ 2/ 2/ 2/ 1ff and ex
   8 for 2* unext
   @p .. ' @p . + . ' !b !b ex rwrw ;
play 26 @p drop !p
mix
 27 dup @p + ;
0 ,
pop drop push -if
   1ffff and over b! pop ex pluck ex push
then push over b! pop @p . +
w 33 1ffff ,
pop mix @p !b !b mxplay 36 ' @p play ' 37
                                              

A full listing is on GitHub: https://github.com/garyaj/musicbox