And the USA has again put a rover on Mars

We did it again! Curiosity is on Mars! Here’s to all the men and women at NASA’s Jet Propulsion Laboratory and everyone else who worked on the project. Now the real work begins, lol.

http://mars.jpl.nasa.gov/msl-raw-images/proj/msl/redops/ods/surface/sol/00000/opgs/edr/fcam/FRA_397502305EDR_T0010000AUT_04096M_.JPG

– Teknoman117

C++ Plugins with Boost::Function on Linux

Over the past few weeks, one of the concepts I’ve been experimenting with is plugin architecture: a core application that can be extended by shared objects without recompiling the core program, or, possibly, a way of defining more services in an application built around plugins. What I’ve done so far isn’t much, but I haven’t spent much time on it. When I started researching it, one of the things I wanted was to be able to define a C++ object in a plugin and instantiate that object in the main program. So, this is what I have come up with. There are three parts to this solution: the plugin class definition in plugin.hpp, the plugin in awesomeplugin.cpp, and the loader code in loader.cpp. I’ve also included a CMakeLists.txt file to compile it with CMake.

plugin.hpp

#ifndef PLUGIN_HPP
#define PLUGIN_HPP

#include <string>

namespace plugins
{
  class Plugin 
  {  
  public:
    // Virtual destructor so the loader can safely delete a plugin through a Plugin*
    virtual ~Plugin() {}

    virtual std::string toString() = 0;
  };
}

#endif

awesomeplugin.cpp

#include "plugin.hpp"

namespace plugins
{
  class AwesomePlugin : public Plugin
  {  
  public:
    // A function to do something, so we can demonstrate the plugin
    std::string toString()
    {
      return std::string("Coming from awesome plugin");
    }
  };
}

extern "C" 
{
  // Function to return an instance of a new AwesomePlugin object
  plugins::Plugin* construct()
  {
    return new plugins::AwesomePlugin();
  }
}

loader.cpp

#include <iostream>
#include <string>
#include <vector>
#include <dlfcn.h>
#include <boost/function.hpp>

#include "plugin.hpp"

typedef std::vector<std::string>             StringVector;
typedef boost::function<plugins::Plugin* ()> pluginConstructor;

int main (int argc, char** argv)
{
  // Assemble the names of plugins to load
  StringVector plugins;
  for(int i = 1; i < argc; i++)
  {
    plugins.push_back(argv[i]);
  }

  // Iterate through all the plugins and call construct and use an instance
  for(StringVector::iterator it = plugins.begin(); it != plugins.end(); it++)
  {
    // Alert that we are attempting to load a plugin
    std::cout << "Loading plugin \"" << *it << "\"" << std::endl;

    // Load the plugin's .so file
    void *handle = NULL;
    if(!(handle = dlopen(it->c_str(), RTLD_LAZY)))
    {
      std::cerr << "Plugin: " << dlerror() << std::endl;
      continue;
    }
    dlerror();

    // Get the pluginConstructor function
    pluginConstructor construct = (plugins::Plugin* (*)(void)) dlsym(handle, "construct");
    const char *error = NULL;
    if((error = dlerror()))
    {
      // dlerror() clears the message once it has been read, so print the saved copy
      std::cerr << "Plugin: " << error << std::endl;
      dlclose(handle);
      continue;
    }

    // Construct a plugin
    plugins::Plugin *plugin = construct();
    std::cout << "[Plugin " << *it << "] " << plugin->toString() << std::endl;
    delete plugin;

    // Close the plugin
    dlclose(handle);
  }

  return 0;
}

CMakeLists.txt

# Project Stuff
cmake_minimum_required (VERSION 2.6)
project (PluginDemo)

# Default Options
add_definitions("-std=c++0x")

# Find Boost
find_package(Boost REQUIRED)
include_directories(${Boost_INCLUDE_DIRS})

# Pull in the project includes
include_directories(${PROJECT_SOURCE_DIR}/include)
set(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/lib)
set(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/bin)
set(LIBS ${LIBS} pthread boost_thread rt)

# Build the plugin experiment
add_executable(pluginloader src/loader.cpp)
target_link_libraries(pluginloader ${LIBS} dl)
add_library(awesomeplugin SHARED src/awesomeplugin.cpp)

Basically, create a directory with the folders bin, lib, and src. Put loader.cpp, awesomeplugin.cpp, and plugin.hpp in src, and CMakeLists.txt in the top-level directory. Open a terminal there and run "cmake . && make". Then run the pluginloader program and pass it the path to the plugin’s .so in the lib folder. Here is the output from my computer.

nathaniel@XtremePC:~/Programming/Experimentation> cmake .
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Boost version: 1.46.1
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nathaniel/Programming/Experimentation
nathaniel@XtremePC:~/Programming/Experimentation> make
Scanning dependencies of target awesomeplugin
[ 50%] Building CXX object CMakeFiles/awesomeplugin.dir/Plugins/awesomeplugin.cpp.o
Linking CXX shared library lib/libawesomeplugin.so
[ 50%] Built target awesomeplugin
Scanning dependencies of target pluginloader
[100%] Building CXX object CMakeFiles/pluginloader.dir/Plugins/loader.cpp.o
Linking CXX executable bin/pluginloader
[100%] Built target pluginloader
nathaniel@XtremePC:~/Programming/Experimentation> bin/pluginloader lib/libawesomeplugin.so
Loading plugin "lib/libawesomeplugin.so"
[Plugin lib/libawesomeplugin.so] Coming from awesome plugin
nathaniel@XtremePC:~/Programming/Experimentation>

– Teknoman117

Nvidia system with 3 monitors

Last year (2011), when I graduated high school, my school was going through a huge overhaul of the campus. The old campus was slowly being demolished as new buildings went up. Along with that came a shift in technological resources: they decided to write off or just plain junk a significant portion of the old computers (P4-era Dell Optiplex machines), and I happened to obtain two 17″ TFT LCD screens. I brought them home thinking I’d do something with them in the future. I was working with an upstart gaming studio, E1FTW Games (http://www.e1ftwgames.com/), that summer (I still am) and I had an iMac on my desk, so I did nothing with the monitors. After I left for college, I used the pair of them at home with an old eMachines computer that my family had long since forgotten about, because my primary (most awesome) computer had come with me to college and lived in my dorm room. When I finished my first year, I moved back in with my parents (I’m still there as of July 2012) and set my big computer back up. I use a 1080p 23.5″ LCD TV as my primary monitor, but I seriously wanted to use the pair of monitors with my desktop. Much to my dismay, Nvidia GPUs only support two monitors per chip, so even though I had three monitors on my desk, only the 23.5″ panel worked along with one of the smaller screens.

So there I was, trying to find the cheapest Nvidia GPU I could that would fit into a PCIe x1 slot. Much to my surprise, they cost more than their PCIe x16 counterparts, something I regard as pretty damn stupid. So I searched for one that would fit in the PCI bus. Those cost even more than the PCIe x1 cards. It just wasn’t fair. I was lining up to buy a GeForce GT 430 that slotted into a PCI socket for something like $80. I was pretty bummed that this was the only solution, but then I had an idea. PCIe is supposed to be failure tolerant: if one of the lanes goes dead, the link isolates the problem by ignoring that lane. So I had a thought: could I stick a PCIe x16 card in a PCIe x1 slot and operate at just 1/16th the bandwidth? Sure enough, there were websites all over the internet that described cutting the end off a PCIe x1 slot and plugging in a PCIe x16 card. I decided to try it out, and what do you know, it worked. So here are some pictures.

All the screens loaded up onto my desk

My desktop. It doesn’t look like much, but it’s my trusty computer.

The eMachines computer, which I’m once again stripping of its GeForce 7300 GS graphics card

This is one of the most confused pieces of computing hardware I’ve ever owned. It’s labeled as a GeForce 7200 GS, but it’s been identified as either that OR a GeForce 7300 SE (not a typo)

Targeted a PCIe x1 slot for chopping

Both cards installed: a GeForce GTX 560 2 GB (the primary card) and the GeForce 7200/7300 GS

Viewing Steam and a test website under Chrome on the monitors plugged into the 7200 GS and running the Nvidia fluids demo on the primary monitor. Fun fact: the fluids demo runs fine on the old card too, because it uses the GTX 560 to run the physics

Both GPUs identified in Furmark under Windows 7

World of Tanks up on the center monitor and stuff on the sides

All three monitors up and working under OpenSUSE 12.1 with the nvidia 295.59 drivers installed. This is under Xinerama, which I ended up disabling, see below

There was one unforeseen side effect of running the GPUs under Linux (my OS of choice). I was trying to use Xinerama to make one contiguous display so I could do the awesome extended desktop thing, but alas, it was not to be, considering that I am using two widely different cards. The GeForce 7300 is an old card from the Windows XP days. It doesn’t even use unified shading processors: it has separate vertex and fragment units straight on the chip, and it’s a DX9 GPU. The primary card is a GeForce GTX 560, a card with 8 times the memory and 336 CUDA cores that runs hundreds of times faster and supports DX11 and OpenGL 4.x. So compositing did not work: GL was disabled on the displays driven by the old card because it wasn’t compatible with the main card, and because not all displays had GL, KDE wouldn’t run the effects manager. This resulted in really slow window operations, and the UI was very laggy. So I decided to give separate X screens a go. It works flawlessly. Windows may be locked to their respective screens, but it’s not at all bad. KWin places new windows on the screen where the mouse is when the application is launched. I do wish that when I want a new Chromium window I could put it on another screen without having to run DISPLAY=":0.2" chromium from the console every time it’s already running on another X screen. I spend a lot of time in the console though, so it’s not really too bad. Beats having only one monitor. Since I chose to do it this way, OpenGL applications are supported on all the screens, and they start on the primary screen by default unless instructed otherwise. Fullscreen OpenGL applications on the two side monitors are unpredictable and unstable, but just fine on the center screen, driven by the massive GPU. All in all, it’s an awesome setup and I love it.

Linux has come a long way since its inception, and now, with Unity3D, a very popular game engine, officially supporting Linux and Autodesk releasing their 3D software (such as Maya) for Linux, maybe Windows will start losing its stranglehold on gaming.

– Teknoman117

AVR RTOS Update

I haven’t forgotten about my little RTOS project, although it’s moving towards not really being an RTOS. The goal is to write a task manager for the AVR and, by extension, the Arduino. I don’t have an Arduino Mega or any other board whose AVR has more than 64 K words of flash, such as an ATmega2560. On those parts the program counter is wider than 16 bits, so return addresses and function pointers change size, and I can’t properly test a task switcher for them. So this project will support AVRs that have 16-bit program counters.

As I mentioned earlier, this project is moving away from being an RTOS to more of a process manager. The task switcher will still consider time as a factor in the decision to run a thread, but will have an extended set of run conditions. I am going to add the concept of a lock to the task switcher and remove the concept of priority. This is so the AVR CPU does not have to waste precious clock cycles context switching to a task only to have it call yield again. A task’s locks are going to be kept in a linked list, and when the list is empty, the task can run. I am not removing the concept of the “next run time”, because I believe that most thread locks are going to be due to sleeping. Take a PID algorithm: it needs to run at 25 Hz, and doesn’t need its locks evaluated until it’s time to run again. A lock could be used, for example, with a UART reader, which should stay locked until data is available. This brings some new definitions into the code, along with the discovery that on the AVR, malloc is not an expensive operation (~100 cycles).

#include <stdint.h>

struct _avr_task_lock
{
    struct _avr_task_lock *next_lock;    // next lock in the list, or NULL
    void *lock_data;                     // data the lock function inspects
    char (*lock_function)(void *);       // decides whether this lock should expire
};

struct _avr_task_entry
{
    uint32_t next_run_time;
    uint16_t *stackptr;               // a zero (NULL) stack pointer marks the entry as invalid
    struct _avr_task_lock *lock_list;
};

The _avr_task_lock structure defines the lock object. It contains a pointer to the next lock, a pointer to some data, and a pointer to a function that decides whether the lock should expire based on the data pointed to by lock_data. The _avr_task_entry structure defines a task entry. It contains the next time the task should run (locks are not crawled until that time arrives), the current stack pointer, and a pointer to the first lock object. I think the manager should use the X and Z pointers to hold the last and current entries once everything is ported to assembly. The only reason I keep a next run time value is that if a thread does not need to run until a certain time, there is no point wasting precious CPU cycles evaluating its locks before then.
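To make the run rule concrete, here is a minimal sketch in C of how a task's entry might be checked. It is not code from the project: task_can_run and data_available are hypothetical names, and it assumes that lock_function returns nonzero once its lock has expired and that expired locks are simply unlinked (freeing them is left to whoever allocated them).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct _avr_task_lock
{
    struct _avr_task_lock *next_lock;
    void *lock_data;
    char (*lock_function)(void *);   /* assumption: nonzero once the lock has expired */
};

struct _avr_task_entry
{
    uint32_t next_run_time;
    uint16_t *stackptr;              /* NULL marks an invalid entry */
    struct _avr_task_lock *lock_list;
};

/* Hypothetical lock function: expires once the byte count it watches is nonzero. */
static char data_available(void *data)
{
    return *(volatile uint8_t *)data != 0;
}

/* Hypothetical helper: returns 1 when the task may run at time `now`. */
static int task_can_run(struct _avr_task_entry *task, uint32_t now)
{
    assert(task != NULL);

    if (task->stackptr == NULL)
        return 0;                                /* invalid entry */
    if ((int32_t)(now - task->next_run_time) < 0)
        return 0;                                /* still sleeping, so locks are not crawled */

    /* Unlink expired locks from the front; stop at the first one still held. */
    while (task->lock_list != NULL)
    {
        struct _avr_task_lock *lock = task->lock_list;
        if (!lock->lock_function(lock->lock_data))
            return 0;
        task->lock_list = lock->next_lock;       /* the caller owns the lock's memory */
    }
    return 1;                                    /* the list is empty: runnable */
}
```

A UART reader would hang a lock with data_available on its entry, pointing lock_data at its receive counter. The task stays parked until the counter becomes nonzero and its next run time has arrived.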

A pure assembly function will be added to wrap each function that is to be executed as a thread. The wrapper is there so that if the thread function returns, the return is caught instead of blowing up like the current implementation. Basically, when the task adder pushes an executable thread, it stores a pointer to the desired function as a parameter for the wrapper function. The wrapper, when switched to, calls the thread function, and if that function ever returns, it invalidates the thread’s entry and performs a context switch, so the thread is never executed again.
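The wrapper idea can be tried out on a PC before any assembly is written. The sketch below is only an illustration: task_slot, task_wrapper, fake_yield, and count_once are made-up names, and fake_yield merely counts calls, whereas the real context switch would never come back to a retired task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_function)(void *);

/* Hypothetical bookkeeping for one task; not the layout used by the project. */
struct task_slot
{
    task_function function;   /* the thread body stored by the task adder */
    void *argument;
    uint8_t valid;            /* cleared once the thread body returns */
};

static unsigned yield_count;  /* stand-in state so the sketch runs on a PC */

/* Stand-in for the real context switch. */
static void fake_yield(void)
{
    yield_count++;
}

/* Every task starts in the wrapper rather than in its own function. */
static void task_wrapper(struct task_slot *slot)
{
    assert(slot != NULL && slot->function != NULL);

    slot->function(slot->argument);   /* run the thread body */
    slot->valid = 0;                  /* it returned, so retire the entry */
    fake_yield();                     /* hand the CPU to another task */
}

/* Example thread body that returns right away. */
static void count_once(void *argument)
{
    (*(int *)argument)++;
}
```

Because the wrapper owns the return path, a thread body that falls off its end just retires its slot instead of popping garbage off the stack as a return address.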

Eventually, when I get around to it, I’ll make a C++ extension to this, for the Arduino boards, or just for people who use avr-g++. Personally, I shy away from C++ in resource-constrained environments, but hell, to each his own.

– Teknoman117

Simple AVR RTOS – C version

And now, less than 24 hours later, I have a version compatible with AVR-GCC! It’s actually the first time I’ve ever used C and assembly together in one application. It comes with a pretty simple application I threw together as a demo: blinking lights again, and it’s actually the same thing as the assembly version. It’s a work in progress, so don’t use it for anything important right now. I’m working on improving the code so it’s more stable and adding more functionality. As of now it can start up and schedule multiple tasks to run at regular intervals. An attributes object is passed to each task, so potentially you could have one function that does a couple of different things based on what’s passed to it. Tasks can yield their execution when waiting, but it’s an unintelligent yield: the task gives up all rights to execute until another of its time periods has elapsed. Each task gets its own stack, and the pointer you pass in is the END of the stack. It’s also pretty small. Each entry in the task table is 12 bytes (two-byte stack pointer, four-byte “next run” time, four-byte “update interval”, one-byte priority, and one byte to mark whether it’s a valid slot) and the attributes object is 6 bytes (one-byte id, one-byte priority, and four-byte update interval). So each task costs 18 bytes to register, plus a stack. The stack has to be larger than 35 bytes, because that’s how much is stored in a context switch: all 32 registers, the SREG, and the two-byte return address. Here is the example main file; check out GitHub for all the code (link at bottom).
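As an aside before the listing, the memory figures above can be double-checked with a few lines of C. This is only a sketch: the task_table_entry field names are guesses (only the sizes come from the description), task_ram_cost and stack_is_large_enough are hypothetical helpers, and the packed attribute is a GCC/Clang extension used so that a desktop compiler lays the structs out the way an 8-bit AVR would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names are guesses; only the sizes come from the description. */
struct __attribute__((packed)) task_table_entry
{
    uint16_t stack_pointer;     /* 2 bytes */
    uint32_t next_run;          /* 4 bytes */
    uint32_t update_interval;   /* 4 bytes */
    uint8_t  priority;          /* 1 byte  */
    uint8_t  valid;             /* 1 byte  */
};

struct __attribute__((packed)) task_attributes
{
    uint8_t  id;
    uint8_t  priority;
    uint32_t update_interval;
};

enum
{
    CONTEXT_BYTES = 32 + 1 + 2  /* r0-r31, SREG, two-byte return address */
};

static_assert(sizeof(struct task_table_entry) == 12, "table entry is 12 bytes");
static_assert(sizeof(struct task_attributes) == 6, "attributes object is 6 bytes");

/* RAM one task needs: table entry + attributes object + its stack. */
static size_t task_ram_cost(size_t stack_bytes)
{
    return sizeof(struct task_table_entry) + sizeof(struct task_attributes) + stack_bytes;
}

/* The stack has to hold more than one saved context. */
static int stack_is_large_enough(size_t stack_bytes)
{
    return stack_bytes > CONTEXT_BYTES;
}
```

With the 128-byte stacks used in the demo, each task comes to 18 + 128 = 146 bytes of RAM.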

/*
 *  main.c
 *  Simple testing program (blinks lights) for AVR-RTOS
 *
 *  Copyright (C) 2012 Nathaniel Lewis
 *
 *  This program is free software: you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation, either version 3 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */ 

#include <avr/io.h>
#include <avr/interrupt.h>

#include "avr-rtos.h"

struct _avr_rtos_task_attr led_1_attr;
struct _avr_rtos_task_attr led_2_attr;
struct _avr_rtos_task_attr led_3_attr;

uint8_t                    led_1_stack[128];
uint8_t                    led_2_stack[128];
uint8_t                    led_3_stack[128];

void led_task(struct _avr_rtos_task_attr *attr)
{
    // To demonstrate passing an operand to the tasks, use the thread id - 1 as the pin to change
    uint8_t pin = attr->id - 1;
    DDRB |= (1 << pin);
    while(1)
    {
        PORTB |= (1 << pin);
        avr_rtos_task_yield();
        PORTB &= ~(1 << pin);
        avr_rtos_task_yield();
    }
}

int main(void)
{
    // Make ready the RTOS!
    avr_rtos_init();

    // We will be starting three tasks, all use the same function - although they have independent stacks and 
    // take their ID - 1 as the pin to toggle

    // LED 1 Task
    led_1_attr.priority = 1;
    led_1_attr.update_interval = (1000 * 8);  // Execute every 1000 ms
    avr_rtos_add_task(&led_task, &led_1_stack[127], &led_1_attr);

    // LED 2 Task
    led_2_attr.priority = 2;
    led_2_attr.update_interval = (500 * 8);  // Execute every 500 ms
    avr_rtos_add_task(&led_task, &led_2_stack[127], &led_2_attr);

    // LED 3 Task
    led_3_attr.priority = 3;
    led_3_attr.update_interval = (250 * 8);  // Execute every 250 ms
    avr_rtos_add_task(&led_task, &led_3_stack[127], &led_3_attr);

    // Set up the timer for interrupts.  This program was written to run at 11.0592 MHz; for other
    // clocks, finagle with the prescaler settings and the compare match register.  The goal is to
    // have a compare match 8000 times per second.
    //
    // interrupts per second = F_CPU / (prescaler * (OCR0A + 1))
    //

    TIMSK0 = _BV(OCIE0A);
    TCCR0A = _BV(WGM01);
    OCR0A = 0xAD;
    TCCR0B = _BV(CS01);
    sei();

    while (1)
    {
        // Since this is the thread when there is nothing to do, and we don't have anything to do,
        // just call thread yield.  Though we could do something useful....
        //avr_rtos_task_yield();    // cause a lock up, check in simulator later....
    }

    // return control to libgcc.S
    return 0;
}

ISR(TIMER0_COMPA_vect)
{
    // Only call the context switcher every 8th interrupt (1 kHz)
    if(!(avr_rtos_tick() % 8))
        avr_rtos_isr();
}
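The timer arithmetic in main() is easy to get slightly wrong, so here is a small sketch of it in plain C. In CTC mode the counter runs from 0 up to OCR0A inclusive, so each compare match takes OCR0A + 1 timer counts. The function names are made up for this example, and the rounding picks the nearest whole number of counts rather than searching for the best value.

```c
#include <assert.h>
#include <stdint.h>

/* Compare-match interrupts per second for an 8-bit timer in CTC mode. */
static double ctc_interrupt_rate(uint32_t f_cpu, uint16_t prescaler, uint8_t ocr)
{
    assert(prescaler != 0);
    return (double)f_cpu / ((double)prescaler * (ocr + 1.0));
}

/* Compare value for roughly `target_hz` interrupts per second. */
static uint8_t ctc_compare_value(uint32_t f_cpu, uint16_t prescaler, uint32_t target_hz)
{
    assert(prescaler != 0 && target_hz != 0);

    uint32_t divisor = prescaler * target_hz;
    uint32_t counts = (f_cpu + divisor / 2) / divisor;   /* timer counts per interrupt, rounded */

    if (counts < 1)
        counts = 1;
    if (counts > 256)
        counts = 256;                                    /* an 8-bit timer tops out at 256 counts */
    return (uint8_t)(counts - 1);
}
```

For 11.0592 MHz with the clk/8 prescaler, ctc_compare_value(11059200UL, 8, 8000) returns 172 (0xAC), which gives about 7991 interrupts per second. The 0xAD used above gives about 7945, so the 1 ms tick runs slightly long either way.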

I’ve released this under the GPL v3 license in hopes that someone finds it useful eventually. I plan on adding more functionality, like mutexes, removing tasks, buffers, conditions under which a task should not execute again, C++ support, etc. The C++ support is so that it works with the Arduino. I also plan on making the source Doxygen compatible and including a Doxyfile at some point. You can check out the whole code on my GitHub AVR repository at https://github.com/Teknoman117/avr-libraries/tree/master/avr-simple-rtos

Simple AVR RTOS

I’ve always wondered what was under the hood of all of those RTOSes out there, so I decided to throw one together myself. It’s a mess, and you need avrasm2.exe from AVR Studio 6 to assemble it. It’s written for the ATmega644 running at 11.0592 MHz. The code blinks three LEDs connected to PORTB0, PORTB1, and PORTB2. Each successive pin blinks twice as fast. Each LED has its own thread controlling it, and a little RTOS scheduler runs the threads. The original main routine is used as a catch-all when there is nothing else to run. So, here is the code. I hope to clean it up a bit tomorrow and make a version compatible with AVR-GCC and C functions.
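Before the assembly, it may help to see the selection rule in a friendlier form. The sketch below is a rough C rendering of what the comments in the listing's select_next_task routine describe, not a translation of it. The task_entry struct is a simplified, hypothetical view of a table entry, and slot 0 stands for main.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_OF_TASKS 4   /* slot 0 is main, the catch-all */

/* Simplified, hypothetical view of one table entry. */
struct task_entry
{
    uint32_t next_run;   /* system time at which the task wants to run */
    uint8_t  priority;
    uint8_t  valid;      /* zero for an empty slot */
};

/* Returns the index of the task to run at time `now`. */
static uint8_t select_next_task(const struct task_entry table[NUM_OF_TASKS], uint32_t now)
{
    uint8_t selected = 0;        /* fall back to main */
    uint8_t best_priority = 0;   /* main has zero priority */

    assert(table != NULL);
    for (uint8_t i = 1; i < NUM_OF_TASKS; i++)
    {
        if (!table[i].valid)
            continue;                          /* empty slot */
        if (now < table[i].next_run)
            continue;                          /* not due yet */
        if (table[i].priority < best_priority)
            continue;                          /* a more important task is already chosen */
        selected = i;
        best_priority = table[i].priority;
    }
    return selected;
}
```

The highest-priority task that is due wins, and when nothing is due the scheduler drops back to main.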

/*
 * avr_rtos.asm
 *
 *  Created: 6/10/2012 12:34:32 PM
 *   Author: Nathaniel Lewis
 */ 

 ; Variables
 .equ rtos_table      = 0x0100   ; (end = 0x012F) Beginning of the RTOS table of 48 bytes for 4 entries
 .equ system_timer    = 0x0130   ; (end = 0x0133) System clock
 .equ running_task    = 0x0134   ; (end = 0x0134) Current Running Task

 .equ task1_stack     = 0x0136   ; (end = 0x0235) Task 1 stack
 .equ task1_stack_end = 0x0235
 .equ task2_stack     = 0x0236   ; (end = 0x0335) Task 2 stack
 .equ task2_stack_end = 0x0335
 .equ task3_stack     = 0x0336   ; (end = 0x0435) Task 3 stack
 .equ task3_stack_end = 0x0435

 .equ num_of_tasks = 0x04
 .equ num_of_tasks_mask = num_of_tasks - 1
 .equ task_page_size = 0x0c

 ; Macro to increment the system counter (counter stored in r24:r25)
 .macro INCREMENT_COUNTER
    ldi ZL, low(system_timer) ; Load in the base address of the system timer
	ldi ZH, high(system_timer)
	ld  r24, Z                  ; Load the system clock value from ram
	ldd r25, Z+1
	ldd r0, Z+2
	ldd r1, Z+3
	ldi r16, 0x01               ; Add one to it
	ldi r17, 0x00
	add r24, r16
	adc r25, r17
	adc r0, r17
	adc r1, r17
	st  Z,   r24
	std Z+1, r25
	std Z+2, r0
	std Z+3, r1
 .endmacro

 ; A macro to push all of the registers
 .macro PUSH_ALL_REGS
	push r0
	push r1
	push r2
	push r3
	push r4
	push r5
	push r6
	push r7
	push r8
	push r9
	push r10
	push r11
	push r12
	push r13
	push r14
	push r15
	push r16
	push r17
	push r18
	push r19
	push r20
	push r21
	push r22
	push r23
	push r24
	push r25
	push r26
	push r27
	push r28
	push r29
	push r30
	push r31
	in r31, SREG
	push r31
.endmacro

 ; A macro to pop all of the registers
 .macro POP_ALL_REGS
	pop r31
	out SREG, r31
	pop r31
	pop r30
	pop r29
	pop r28
	pop r27
	pop r26
	pop r25
	pop r24
	pop r23
	pop r22
	pop r21
	pop r20
	pop r19
	pop r18
	pop r17
	pop r16
	pop r15
	pop r14
	pop r13
	pop r12
	pop r11
	pop r10
	pop r9
	pop r8
	pop r7
	pop r6
	pop r5
	pop r4
	pop r3
	pop r2
	pop r1
	pop r0
 .endmacro

 ; Reset Vector
 .org 0x00                    ; Vector 0x0000 has to point to the beginning of the program
 rjmp setup                    

 ; Timer CTC OCR0A Vector
 .org OC0Aaddr                
 rjmp timer0ovf

 ; Program Start
 .org 0x38                       ; This is the first word not in the iVector map
 setup:                          ; Set up all the hardware
	; set up some ports
	ldi r16, 0xff
	out DDRB, r16
	;ldi r16, 0x00
	out PORTB, r16

    rcall init_rtos              ; Initialize the rtos table
	sei                          ; Enable interrupts

	; Add task 1
	ldi r24, low(task1)          ; Load the function pointer to task1 into r24:r25
	ldi r25, high(task1)
	ldi XL, low(task1_stack_end) ; Load the high end of the stack pointer of task 1 -> X
	ldi XH, high(task1_stack_end)
	ldi r16, 0x01                ; Task 1
	ldi r17, 0x01                ; Low Priority
	ldi r18, 0x00                ; Request to run every 1024 ms (0x2000 * 1/8 ms)
	ldi r19, 0x20	
	ldi r20, 0x00
	ldi r21, 0x00		     
	rcall add_task               ; Add the task

	; Add task 2
	ldi r24, low(task2)          ; Load the function pointer to task2 into r24:r25
	ldi r25, high(task2)
	ldi XL, low(task2_stack_end) ; Load the high end of the stack pointer of task 2 -> X
	ldi XH, high(task2_stack_end)
	ldi r16, 0x02                ; Task 2
	ldi r17, 0x02                ; task 1 Priority + 1
	ldi r18, 0x00                ; Request to run every 512 ms (0x1000 * 1/8 ms)
	ldi r19, 0x10	
	ldi r20, 0x00
	ldi r21, 0x00		     
	rcall add_task               ; Add the task

	; Add task 3
	ldi r24, low(task3)          ; Load the function pointer to task3 into r24:r25
	ldi r25, high(task3)
	ldi XL, low(task3_stack_end) ; Load the high end of the stack pointer of task 3 -> X
	ldi XH, high(task3_stack_end)
	ldi r16, 0x03                ; Task 3
	ldi r17, 0x03                ; task 2 Priority + 1
	ldi r18, 0x00                ; Request to run every 256 ms (0x0800 * 1/8 ms)
	ldi r19, 0x08	
	ldi r20, 0x00
	ldi r21, 0x00		     
	rcall add_task               ; Add the task

	; Start the timer, and therefore, the RTOS
	ldi r16, 0xAD
	out OCR0A, r16
	ldi r16, 0x02                ; 0x02 = WGM01 in TCCR0A (CTC mode), OCIE0A in TIMSK0, CS01 in TCCR0B (clk/8 prescaler)
	out TCCR0A, r16
	sts TIMSK0, r16              ; Enable the compare match A interrupt
	out TCCR0B, r16              ; Start the timer
 main:
    nop
	rjmp main

 task1:
	cbi PORTB, 0
	rcall task_yield    ; Wait for next thread cycle
	ldi r16, 0xff       ; Turn light off
	sbi PORTB, 0
	rcall task_yield    ; Wait for next thread cycle
	rjmp task1

 task2:
	cbi PORTB, 1
	rcall task_yield    ; Wait for next thread cycle
	ldi r16, 0xff       ; Turn light off
	sbi PORTB, 1
	rcall task_yield    ; Wait for next thread cycle
	rjmp task2

 task3:
	cbi PORTB, 2
	rcall task_yield    ; Wait for next thread cycle
	ldi r16, 0xff       ; Turn light off
	sbi PORTB, 2
	rcall task_yield    ; Wait for next thread cycle
	rjmp task3

 ; ISR for timer0 ocr0a ctc.  Selects a task to run
 timer0ovf:
	; First increment the system clock
	INCREMENT_COUNTER
	andi r24, 0x07             ; AND r24 (timer low byte) with 0x07
	tst r24                    ; if r24 is zero (bits 2, 1, 0 clear), the tick count is divisible by eight
	breq context_switch        ; and if it is, do the context switch
	reti                       ; if no, return from the ISR

context_switch:
	; Push the current context
	PUSH_ALL_REGS

	; Look up the base address of this task's entry
	ldi r17, task_page_size    ; set how many bytes per entry
	ldi ZL, low(running_task)  ; Pointer to current task
	ldi ZH, high(running_task)
	ld r16, Z                  ; get running task
	ldi YL, low(rtos_table)    ; Get pointer to rtos_table
	ldi YH, high(rtos_table)   
	mul r16, r17               ; multiply running task by entry size to get pointer into table
	add YL, r0                 ; add the pointer to the base to get the table entry base address
	adc YH, r1                 

	; Store the stack pointer entry in the table
	in r0, SPL
	in r1, SPH
	st Y, r0
	std Y+1, r1

    ; Select the next task to run
	rcall select_next_task

	; Store the next running task id
	ldi ZL, low(running_task)  ; Pointer to current task
	ldi ZH, high(running_task)
	st Z, r16 

	; Load the next task stack pointer
	ldi YL, low(rtos_table)    ; Get pointer to rtos_table
	ldi YH, high(rtos_table)   
	ldi r17, task_page_size    ; set how many bytes per entry
	mul r16, r17               ; multiply selected task by entry size to get pointer into table
	add YL, r0                 ; add the pointer to the base to get the table entry base address
	adc YH, r1 
	ld  r0, Y                  ; Load the stack pointer
	ldd r1, Y+1
	out SPL, r0                ; Output the stack pointer
	out SPH, r1

	; Pop all the registers
	POP_ALL_REGS
	reti                      ; Return from interrupt

 ; Subroutine to yield the current thread of execution until the next cycle
 task_yield:
 	; Push the current context
	PUSH_ALL_REGS

	; Look up the base address of this task's entry
	ldi r17, task_page_size    ; set how many bytes per entry
	ldi ZL, low(running_task)  ; Pointer to current task
	ldi ZH, high(running_task)
	ld r16, Z                  ; get running task
	ldi YL, low(rtos_table)    ; Get pointer to rtos_table
	ldi YH, high(rtos_table)   
	mul r16, r17               ; multiply running task by entry size to get pointer into table
	add YL, r0                 ; add the pointer to the base to get the table entry base address
	adc YH, r1                 

	; Store the stack pointer entry in the table
	in r0, SPL
	in r1, SPH
	st Y, r0
	std Y+1, r1

    ; Select the next task to run
	rcall select_next_task

	; Store the next running task id
	ldi ZL, low(running_task)  ; Pointer to current task
	ldi ZH, high(running_task)
	st Z, r16 

	; Load the next task stack pointer
	ldi YL, low(rtos_table)    ; Get pointer to rtos_table
	ldi YH, high(rtos_table)   
	ldi r17, task_page_size    ; set how many bytes per entry
	mul r16, r17               ; multiply selected task by entry size to get pointer into table
	add YL, r0                 ; add the pointer to the base to get the table entry base address
	adc YH, r1 
	ld  r0, Y                  ; Load the stack pointer
	ldd r1, Y+1
	out SPL, r0                ; Output the stack pointer
	out SPH, r1

	; Pop all the registers
	POP_ALL_REGS
	sei                        ; Could have possibly been cleared
	ret 

 ; Subroutine to select the next task to run
 select_next_task:
 	; Fetch the current system clock time
	ldi ZL, low(system_timer) ; Load in the base address of the system timer
	ldi ZH, high(system_timer)
	ld  r0, Z                  ; Load the system clock value from ram
	ldd r1, Z+1
	ldd r2, Z+2
	ldd r3, Z+3

	; Skip run time update if we are main (test r16 for being zero)
	tst r16
	breq skip_update

	; Update the next run time (next run time = system time + update interval)
	ldd r4, Y+6                ; Load the update interval
	ldd r5, Y+7
	ldd r6, Y+8
	ldd r7, Y+9
	add r4, r0                 ; Add the two
	adc r5, r1
	adc r6, r2
	adc r7, r3
	std Y+2, r4                ; Store it back in ram
	std Y+3, r5
	std Y+4, r6
	std Y+5, r7
 skip_update:
	; Select the next task (highest priority task that wants to run takes, well, priority)
	ldi r16, 0x00               ; Main (the catch-all) catches when its not time to run another
	ldi r17, 0x00               ; Main has zero priority
	ldi r18, 0x00               ; The counter to look at tasks 
	ldi YL, low(rtos_table)     ; Get pointer to rtos_table
	ldi YH, high(rtos_table) 
 select_task:
	inc r18                     ; Increment to next task
	andi r18, num_of_tasks_mask ; Mask bit that are greater than the task count
	tst r18                     ; Test r18 for zero - means we've looped back to main 
	breq task_selected          ; If we've looped back, load the task
	adiw YL, task_page_size     ; Add the page size to the pointer to get to the desired pointer
	ldd r19, Y+11               ; Get if the task is valid
	tst r19
	breq select_task            ; If the task is invalid, skip it, examine another task
	ldd r4, Y+2                 ; Load the desired run time
	ldd r5, Y+3
	ldd r6, Y+4
	ldd r7, Y+5
 cp_byte3:
	cp r3, r7                   ; times are unsigned, so use the unsigned branches
	brlo select_task            ; it's still in the future, try a different task
	breq cp_byte2               ; equal, have to check the next byte
	brsh can_run                ; it's greater than, so run it
 cp_byte2:
	cp r2, r6               
	brlo select_task           
	breq cp_byte1
	brsh can_run           
 cp_byte1:
	cp r1, r5            
	brlo select_task              
	breq cp_byte0
	brsh can_run           
 cp_byte0:
	cp r0, r4              
	brlo select_task            
 can_run:                       ; So now its time to run it, make sure its priority is above the current 
	ldd r20, Y+10               ; Load the priority
	cp r20, r17                 ; Compare with the existing
	brlt select_task            ; Not high enough, try another task
	mov r16, r18                ; Store task id and priority respectively
	mov r17, r20                ; (the priority was loaded into r20 above)
	rjmp select_task            ; Examine another task
 task_selected:
    ret

 ; Subroutine that initializes the rtos stack
 init_rtos:
 	ldi YL, low(rtos_table)    ; Get pointer to rtos_table
	ldi YH, high(rtos_table)   

	; Configure the main entry manually
	ldi r16, 0x00    
	st  Y,   r16               ; Zero out the stack pointer
	std Y+1, r16
	std Y+2, r16               ; Zero out next run time
	std Y+3, r16
	std Y+4, r16
	std Y+5, r16
	std Y+6, r16               ; Zero out the update count
	std Y+7, r16 
	std Y+8, r16 
	std Y+9, r16 
	std Y+10, r16               ; Priority = 0 (its a catch wasted time function)
	ldi r17, 0x01
	std Y+11, r17               ; It is a valid task

	; Set the default entries in the rtos tables
	ldi r18, 0x00              ; entry counter
 next_entry:
	inc r18                    ; increment the entry id
	cpi r18, num_of_tasks      ; compare it to the max count
	breq entries_complete      ; if it has reached the max count, exit
	adiw YL, task_page_size    ; Add the entry size to the base pointer to get next entry
	st  Y,   r16               ; Zero out the stack pointer
	std Y+1, r16
	std Y+2, r16               ; Zero out next run time
	std Y+3, r16
	std Y+4, r16
	std Y+5, r16
	std Y+6, r16               ; Zero out the update count
	std Y+7, r16 
	std Y+8, r16 
	std Y+9, r16 
	std Y+10, r16               ; Priority = 0 (it's uninitialized)
	std Y+11, r16               ; It's uninitialized, so it's invalid
	rjmp next_entry
 entries_complete:
	ldi YL, low(system_timer)  ; Load address of the system clock
	ldi YH, high(system_timer)
	st  Y,   r16               ; Store the empty system timer
	std Y+1, r16
	std Y+2, r16
	std Y+3, r16
	ldi YL, low(running_task)  ; Load address of the running task id
	ldi YH, high(running_task)
	st Y, r16                  ; Store running task = 0
    ret

 ; Subroutine to add a task (r24:r25 is function ptr, X pointer is stack pointer, r16 is id, r17 is priority, r18:r21 is update interval)
 add_task:
	; Start with pushing stack data
	in r0, SPL           ; Back up our stack pointer
	in r1, SPH 
	out SPL, XL          ; Temporarily shift the stack to the new task's
	out SPH, XH
	push r24             ; Push the return address
	push r25
	PUSH_ALL_REGS        ; Push the registers to the stack
	in XL, SPL           ; Get the new stack pointer
	in XH, SPH            
	out SPL, r0          ; Restore our stack
	out SPH, r1

	; Make the necessary table entries
	ldi r22, task_page_size
    ldi YL,  low(rtos_table)   ; Get pointer to rtos_table
	ldi YH,  high(rtos_table)   
	ldi ZL,  low(system_timer) ; Load in the base address of the system timer
	ldi ZH,  high(system_timer)
	mul r16, r22               ; multiply selected task by entry size to get pointer into table
	add YL, r0                 ; add the pointer to the base to get the table entry base address
	adc YH, r1 
	st  Y,   XL                ; Store the stack pointer
	std Y+1, XH
	ldi r22, 0x00 
	ld  r0, Z                  ; Load the system clock value from ram
	ldd r1, Z+1
	ldd r2, Z+2
	ldd r3, Z+3
	add r0, r18                ; Add it to the update interval
	adc r1, r19
	adc r2, r20
	adc r3, r21
	std Y+2, r0               ; Store a run time of update interval + system time
	std Y+3, r1
	std Y+4, r2
	std Y+5, r3
	std Y+6, r18               ; Store the update interval
	std Y+7, r19
	std Y+8, r20
	std Y+9, r21
	ldi r22, 0x01
	std Y+10, r17              ; Store the priority
	std Y+11, r22              ; Store that the task is valid
 done_add_task:
	ret
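
For anyone who finds the assembly hard to follow, here is a rough C++ model of what select_task is meant to do.  It is only a sketch: the struct and field names are made up, kNumTasks is an assumed value (the real num_of_tasks is defined elsewhere), and the struct mirrors the table offsets logically rather than byte for byte.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of one rtos_table entry. The comments give the
// offsets the assembly uses with ldd/std; the names are invented here.
struct TaskEntry {
    std::uint16_t stack_pointer;    // offsets 0-1
    std::uint32_t next_run_time;    // offsets 2-5
    std::uint32_t update_interval;  // offsets 6-9
    std::uint8_t  priority;         // offset 10
    std::uint8_t  valid;            // offset 11
};

constexpr std::size_t kNumTasks = 8;  // assumed value, not from the original source

// Returns the id of the task that should run at time `now`: the valid task
// whose run time has arrived and whose priority is not lower than the best
// seen so far. Falls back to 0 (main) when nothing is due.
std::uint8_t select_task(const std::array<TaskEntry, kNumTasks>& table,
                         std::uint32_t now) {
    assert(table[0].valid);  // entry 0 is main, which init_rtos marks valid
    std::uint8_t best_id = 0;        // main is the catch-all
    std::uint8_t best_priority = 0;  // main has zero priority
    for (std::uint8_t id = 1; id < table.size(); ++id) {
        const TaskEntry& task = table[id];
        if (!task.valid) continue;                   // skip unused slots
        if (task.next_run_time > now) continue;      // still in the future
        if (task.priority < best_priority) continue; // not high enough
        best_id = id;
        best_priority = task.priority;
    }
    return best_id;
}
```

Like the assembly, a task with the same priority as the current best replaces it, so among equal priorities the one later in the table wins.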

Hacking new MCUs into Arduino IDE

It’s probably safe to assume that most of you who end up reading this have heard about the Arduino.  For those of you who haven’t, it is the latest microcontroller craze, and the boards are selling by the tens of thousands.  Basically it’s a development board for a few lucky Atmel AVRs, and it has a standard set of expansion connectors that accept numerous peripheral boards.  Personally, I’m not much of a fan; for most of my projects I don’t need a board with a boot loader.  I own an AVR ISP mk.ii programmer, and I don’t need to spend $35+ on a board with a $2 ATmega328 when I can buy an ATmega644 or a larger chip for $4 and build my own dev board for another few dollars.  I do understand that they are nice for prototyping, though, and that what I’m doing is more of an end-product type of thing.

The robot I built with my friends from the UC Merced Robotics Society for Robomagellan in the spring 2012 semester used a pair of Arduinos, and I wanted to use the code I developed there with my pre-existing software framework for the AVRs.  Since I don’t have any Arduinos hanging around, I needed to either port the code or just make my own Arduino.  My favorite AVR is the ATmega644; it has a ton of I/O and one of the largest flash sizes available in a DIP AVR.  I came across this web page on using a chip from the most popular of the Arduinos in its bare chip form to run a christmas frame.  It goes over how to add a definition to the Arduino IDE hardware list.  As I was reading through it, I noticed a few places where I could set different MCUs, frequencies, etc.  So I decided to try it out.

Adding a whole new MCU to the Arduino environment takes a few steps.  These assume you are adding a new class of chip (such as the ATmega644):

  • Create a new “variant.”  This defines how the Arduino IDE’s digital pins and analog pins map to the physical pins on the device.  This file also covers the PWM and PCINT pins
  • Modify “wiring_private.h”
  • Add a new hardware entry

The Arduino IDE manages different devices in what they seem to call variants.  Each variant is a definition of the chip that you are using.  It defines how pins map to internal components of the AVR.  The first step is defining how many analog and digital pins exist on the device.  The rest of the file maps Arduino digital pin numbers to registers in the device.  It also covers how these pins translate to their corresponding PCINT pins, which pins have PWM capabilities, and which timer drives each of them.  You can get this in my github repository.
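
To give a feel for what a variant boils down to, here is a small host-side sketch of a pin lookup.  This is not the variant file from my repository: the port order (B, D, C, A) and the names Port, PinInfo and digital_pin_info are made up for illustration.  The only hard fact in it is that the ATmega644 has four 8-bit ports, so 32 I/O pins.

```cpp
#include <cassert>
#include <cstdint>

// The four 8-bit I/O ports on the ATmega644.
enum class Port : std::uint8_t { A, B, C, D };

// What a variant has to answer for every Arduino pin number:
// which port register it lives in and which bit it is.
struct PinInfo {
    Port port;
    std::uint8_t bit_mask;
};

constexpr int kNumDigitalPins = 32;  // 4 ports x 8 bits

// Hypothetical layout: pins 0-7 on port B, 8-15 on port D,
// 16-23 on port C, 24-31 on port A. A real variant may differ.
PinInfo digital_pin_info(int pin) {
    assert(pin >= 0 && pin < kNumDigitalPins);
    static const Port kPortOrder[4] = {Port::B, Port::D, Port::C, Port::A};
    return PinInfo{kPortOrder[pin / 8],
                   static_cast<std::uint8_t>(1u << (pin % 8))};
}
```

With this made-up layout, digital_pin_info(13) lands on bit 5 of port D.  A real variant stores the same kind of information in lookup tables in flash rather than computing it.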

The next step is to modify the wiring_private.h header and add the new mcu’s external interrupt count.  You can see where I added the ATmega644 on line 57 of wiring_private.h
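
The change is a one-liner in a preprocessor conditional.  The snippet below is a sketch of the pattern rather than a verbatim copy of the header; the exact list of chips in your copy of wiring_private.h may differ, so treat the surrounding branches as assumptions.  The ATmega644 has three external interrupts (INT0, INT1 and INT2).

```cpp
// Pick the number of external interrupts based on the target MCU.
// The __AVR_*__ macros are defined by avr-gcc from the -mmcu option.
#if defined(__AVR_ATmega1280__) || defined(__AVR_ATmega2560__)
#define EXTERNAL_NUM_INTERRUPTS 8
#elif defined(__AVR_ATmega644__) || defined(__AVR_ATmega644P__)
#define EXTERNAL_NUM_INTERRUPTS 3  // INT0, INT1, INT2
#else
#define EXTERNAL_NUM_INTERRUPTS 2  // default for the ATmega328 class
#endif
```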

The last step is to add an entry for the board you are using.  The board I used is rather old.  It was designed by Wrighthobbies about 7 years ago.  It was literally not much more than a board that broke out the pins of an ATmega32 and provided a stk200 compatible programming header, a reset button, two 5V regulators, an optional oscillator, and some power rails.  The ATmega644 and its variants are pin for pin compatible with the ATmega32, so I just replaced the chips.  Although, you can add a comment to your order at Wrighthobbies and Eddy will just ship an ATmega644 with it.

The hardware testing of the ATmega644 arduino mod with it hooked up to my avrispmkii and an LED

We need to add an entry to “Arduino.app/Contents/Resources/Java/hardware/arduino/boards.txt” to reflect the devboard’s settings:

devboard_m644.name = Wright Hobbies Devboard w/ ATmega644 and ext 11.0592 MHz osc

devboard_m644.upload.protocol=stk500
devboard_m644.upload.maximum_size=65536
devboard_m644.upload.using=avrispmkii

devboard_m644.bootloader.low_fuses=0xCE
devboard_m644.bootloader.high_fuses=0xDD
devboard_m644.bootloader.extended_fuses=0xFF

devboard_m644.build.mcu=atmega644
devboard_m644.build.f_cpu=11059200UL
devboard_m644.build.core=arduino
devboard_m644.build.variant=atmega644
  • <board name>.name – the textual name shown for your board in the Arduino IDE.
  • <board name>.upload.protocol – the protocol which your programmer uses.  In the case of the AVR ISP mk.ii this is stk500
  • <board name>.upload.maximum_size – the size in bytes of the flash memory on your AVR
  • <board name>.upload.using – the programmer which you want to use with your chip
  • <board name>.bootloader.low_fuses – the low fuse byte to set in your chip (see http://www.engbedded.com/fusecalc)
  • <board name>.bootloader.high_fuses – the high fuse byte to set in your chip (see http://www.engbedded.com/fusecalc)
  • <board name>.bootloader.extended_fuses – the extended fuse byte to set in your chip (see http://www.engbedded.com/fusecalc)
  • <board name>.build.mcu – the MCU you are adding
  • <board name>.build.f_cpu – the frequency it operates at
  • <board name>.build.core – currently, unless you want to write your own core library, set it to arduino
  • <board name>.build.variant – set to the variant of the chip you created earlier
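
If it helps to see how these keys hang together, here is a small sketch that pulls one board’s settings out of boards.txt-style text.  It is not how the Arduino IDE itself reads the file; board_settings and trim are hypothetical helpers that just split each line at the first “=” and strip the “<board name>.” prefix.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing spaces, tabs and carriage returns.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Collect every "<board>.<key>=<value>" line for one board into a map
// from <key> to <value>. Blank lines, '#' comments and other boards
// are skipped.
std::map<std::string, std::string> board_settings(const std::string& text,
                                                  const std::string& board) {
    assert(!board.empty());
    std::map<std::string, std::string> settings;
    const std::string prefix = board + ".";
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;
        const std::string key = trim(line.substr(0, eq));
        if (key.compare(0, prefix.size(), prefix) != 0) continue;
        settings[key.substr(prefix.size())] = trim(line.substr(eq + 1));
    }
    return settings;
}
```

Feeding it the entry above and asking for “devboard_m644” gives a map where, for example, “build.mcu” is “atmega644” and “build.variant” is “atmega644”.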

Here is an example of it in the editor

See the board name in the bottom right corner

There is one caveat when doing this, though: you have to use the “Upload with Programmer” option instead of just the upload button.  Here are two YouTube videos of it working.  All of the sources for this are available on GitHub in my repo – http://github.com/Teknoman117/avr-libraries

Blinking LED

Ramping LED