Automation Controller Architecture

April 19th, 2007

I have been trying to decide on a good way to build an automation controller. Primarily I’m trying to decide on the overall architecture. The controller will be written for Linux; I’ll try to keep it portable, but Linux is the target. There are several ways to go about this, but first a list of requirements.

All manner of data types should be supported, including arrays and custom data types. There would have to be some kind of base data type list: BOOL, INT, UINT, REAL, etc. Then the system would have to deal with custom data types like timers, counters and alarms, as well as arrays of all of the above.
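As a rough sketch of what such a type list might look like, here is a tiny registry that starts with the four base types and lets custom types be built from them. Python is used only to keep the illustration short. The `TypeRegistry` class, the `TIMER` layout and the mapping of base types to byte sizes are all assumptions made up for this example, not a settled design.

```python
import struct

# Hypothetical mapping of base types to struct format codes
# (BOOL = 1 byte, INT/UINT = 16 bit, REAL = 32-bit float).
BASE_TYPES = {"BOOL": "?", "INT": "h", "UINT": "H", "REAL": "f"}


class TypeRegistry:
    def __init__(self):
        # type name -> struct format string, packed with no padding
        self._formats = dict(BASE_TYPES)

    def define(self, name, fields):
        """Register a custom type from (field_name, type_name, count) tuples."""
        self._formats[name] = "".join(
            self._formats[type_name] * count for _, type_name, count in fields
        )

    def size(self, name, count=1):
        """Bytes needed for an array of `count` elements of the named type."""
        return struct.calcsize("<" + self._formats[name] * count)


registry = TypeRegistry()
# A made-up timer layout: two status bits plus preset and accumulator.
registry.define("TIMER", [
    ("enabled", "BOOL", 1),
    ("done", "BOOL", 1),
    ("preset", "UINT", 1),
    ("accum", "UINT", 1),
])
print(registry.size("TIMER"))      # one timer
print(registry.size("TIMER", 10))  # an array of ten timers
```

Because custom types are stored the same way as base types, a custom type can itself be used as a field of another custom type, and arrays fall out of the `count` argument.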

The data tables should be completely dynamic. It should be possible to add or remove a data point at any time while the system is running. There may be some module ownership issues to contend with, but for the system to be truly useful it will have to be capable of staying online while the data table changes.
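A minimal sketch of a dynamic table with ownership might look like the following. The `TagTable` name and its methods are hypothetical, and a real implementation would live in shared memory rather than in a Python dictionary; the point is only that tags can come and go at run time and that only the owning module may remove them.

```python
import threading


class TagTable:
    """Hypothetical in-memory data table; tags can be added and removed live."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tags = {}  # tag name -> [owner, value]

    def add(self, owner, name, value=0):
        with self._lock:
            if name in self._tags:
                raise KeyError(f"tag {name!r} already exists")
            self._tags[name] = [owner, value]

    def remove(self, owner, name):
        with self._lock:
            if self._tags[name][0] != owner:
                raise PermissionError(f"{owner!r} does not own {name!r}")
            del self._tags[name]

    def read(self, name):
        with self._lock:
            return self._tags[name][1]

    def write(self, name, value):
        with self._lock:
            self._tags[name][1] = value

    def release(self, owner):
        """Drop every tag a module owns, e.g. when that module is unloaded."""
        with self._lock:
            names = sorted(n for n, (o, _) in self._tags.items() if o == owner)
            for n in names:
                del self._tags[n]
            return names
```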

Modules should be loadable during run time. Any module of any type (even those that didn’t exist when the main controller was compiled) should be loadable into the system and also removable. The rest of the system should be immune to this.

The system should be capable of being redundant. This is probably far more difficult than I think it will be, but it’s essential. At the very least the ability to make the system redundant should be considered in the initial design because this will be very difficult to add as an afterthought later. The question is whether the system itself should be redundant or whether the modules should have redundancy built into them. It may also depend on what is being duplicated, whether it’s a controller, an I/O module or an HMI.

The system needs to be efficient. I hate writing code that wastes memory or clock cycles for no good reason. This system needs to be useful on an old 486 or an embedded system. Not all the modules will make sense on these types of hardware (a GUI HMI for example) but the core system needs to be tight.

It should be very easy to program a module. Just a few pages of documentation and about half a dozen function calls should be all it takes to write a functional module. Obviously there will be advanced features that will require more knowledge but I’ve found that if the initial entry into a new system is difficult then few enter. Make it easy to build the first module and then slowly learn to improve it.

There should be some level of network transparency. A big question is whether the networking is done at the system level or at the module level. I’m leaning toward the module level because I’d like the core system to be as simple as humanly possible. Put the features into the modules. This may make redundancy a little more difficult however so I’ll have to put some more thought into it. This is starting to look more like a DCS than a PLC.

It would make sense to include some method of messaging between modules. Other attempts at Linux-based automation controllers have a central data repository, and that is the only method for modules to communicate with each other. I’d like to see a message-based architecture along with some central shared memory. There are all kinds of ways to share the data. Some are more efficient than others. I don’t like the way MatPLC handles it. They duplicate the entire memory map for each module. I don’t want to do this. I can see some small duplication for modules that need to be able to manipulate fairly large amounts of memory in a small amount of time, but the module should have control of that. A ladder program module for instance might allocate its input table, make its computations, then write its output table.
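The ladder example can be sketched as a single scan function: copy only the tags the module asks for, compute on the private copy, then write back only the outputs. Everything here (the `scan` function, the tag names, the plain dictionary standing in for shared memory) is an assumption for illustration.

```python
import threading


def scan(shared, lock, inputs, outputs, logic):
    """One scan: snapshot the needed tags, run the logic, write the outputs."""
    with lock:
        # Copy only what this module needs, not the whole memory map.
        image = {name: shared[name] for name in inputs + outputs}
    result = logic(image)  # computation runs without holding the lock
    with lock:
        for name in outputs:
            shared[name] = result[name]


def motor_logic(image):
    # Classic start/stop seal-in rung.
    run = (image["start_pb"] or image["motor"]) and not image["stop_pb"]
    return {"motor": run}


shared = {"start_pb": True, "stop_pb": False, "motor": False}
lock = threading.Lock()
scan(shared, lock, ["start_pb", "stop_pb"], ["motor"], motor_logic)
print(shared["motor"])
```

The module decides how much it duplicates by choosing its `inputs` and `outputs` lists, which is the kind of control I have in mind.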

Now for the different architectures.

The first that comes to mind is a simple daemon process that controls all the data and modules. I doubt that a single daemon process would make much sense. At some point libraries would have to be built to ease the task of module building anyway so at the very least a library should exist from the beginning.

A library brings something else to the table, and that is the ability for some of the core to reside in the individual module’s process space. This could help with efficiency. That brings up the second architecture: a single library that handles the entire system, from database configuration to scan timing and memory updating. This is essentially what the MatPLC is. I like this idea but it seems that it might be more difficult to implement the hot-swap modules and dynamic data allocation in this environment. It seems to me that there needs to be some capability in the system to monitor the health of the modules so that it knows when to free a module’s data and remove that module’s messages from the queue. This would probably still be possible in a library but all of the information would have to be stored in a shared memory segment or message queue of some kind. There may be a way to have truly global data in a shared library but I don’t know how to do it. This would also generate some kind of cooperative multitasking arrangement. The process management side of the library would probably be at the mercy of the individual modules. This would not be acceptable but there may be ways around it too.
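One possible way to do the health monitoring is a heartbeat with a timeout. The sketch below is only that, a sketch: `ModuleMonitor`, the callback signature and the injectable clock are invented for this example. Each module checks in periodically, and anything that stays silent longer than the timeout has its cleanup callback run, which is where its data and queued messages would be freed.

```python
import time


class ModuleMonitor:
    """Hypothetical heartbeat watchdog for registered modules."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._last_seen = {}  # module name -> time of last heartbeat
        self._cleanup = {}    # module name -> callback run when it dies

    def register(self, module, on_dead):
        self._last_seen[module] = self._clock()
        self._cleanup[module] = on_dead

    def heartbeat(self, module):
        if module not in self._last_seen:
            raise KeyError(f"unknown module {module!r}")
        self._last_seen[module] = self._clock()

    def reap(self):
        """Run cleanup for every module that missed its deadline."""
        now = self._clock()
        dead = [m for m, seen in self._last_seen.items()
                if now - seen > self._timeout]
        for module in dead:
            del self._last_seen[module]
            self._cleanup.pop(module)(module)
        return dead
```

Note that something still has to call `reap()` on a schedule. In a library-only design that caller would be one of the modules, which is exactly the cooperative arrangement described above.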

I’d like to stick with the library only approach if it’d work. I can’t get my mind around how module and data table registration would take place in a library only situation. It’s simple enough to have the library store the data structures and process information in a shared memory segment and then deal with them with semaphores and such but I think this may simply be too complicated.

The implementation that I am leaning toward is a central daemon process paired with a library of functions to handle data transfer and messaging. The advantage that I see for the library is that there can be a distinction between what is contained in the module’s process space and what is contained in the core. The daemon would be somewhat “kernel-like.” It would handle the module processes and data map issues that were central to the core. It may actually become the core. A large amount of data concerning each module (which tags it owns, its dependencies, etc.) could be managed in this core. Signals could be sent to module processes from this process if need be and the redundancy of the modules could be handled from this process.
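To make the bookkeeping concrete, here is a small sketch of the per-module data such a core might hold. The `Core` class and its methods are hypothetical names chosen for this example. It records which tags each module owns and which modules it depends on, so that when a module goes away the core knows what to free and whom to notify.

```python
class Core:
    """Hypothetical registry kept by the central daemon."""

    def __init__(self):
        self._modules = {}  # module name -> {"tags": set, "deps": set}

    def register(self, name, tags=(), depends_on=()):
        missing = [d for d in depends_on if d not in self._modules]
        if missing:
            raise LookupError(f"{name!r} depends on unloaded modules: {missing}")
        self._modules[name] = {"tags": set(tags), "deps": set(depends_on)}

    def dependents(self, name):
        """Modules that directly depend on `name`."""
        return sorted(m for m, info in self._modules.items()
                      if name in info["deps"])

    def unregister(self, name):
        """Remove a module; return (tags to free, modules to notify)."""
        info = self._modules.pop(name)
        return sorted(info["tags"]), self.dependents(name)


core = Core()
core.register("io", tags=["pump_run", "level"])
core.register("ladder", tags=["motor"], depends_on=["io"])
core.register("hmi", depends_on=["io", "ladder"])
print(core.unregister("io"))
```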

The obvious disadvantage of the central daemon process is the single point of failure that it becomes. A set of processes that share a common library don’t necessarily have any single point of failure other than the modules’ interdependencies. An HMI module could fail and the logic modules could still keep running. The central daemon would have to stay running or all the modules would become useless. The logic modules could still operate without the HMI module but nothing works without the core daemon. This is where redundancy comes in. This core process would need to have the ability to transfer control to a similar process on another machine. All the better reason to keep this thing simple and let all the heavy lifting be done at the module level.

A slight departure from this method would be a completely modular approach similar to a modern PLC. Everything is a module, with a common library to emulate a backplane. This is similar to the library-only idea except that there would have to be some kind of “controller” module to handle all of the “other” modules. The main drawback that I see in this architecture is the redundancy of code that would happen inside the “controller” modules. The controller itself would almost need to be some kind of modular code base so that a user could have ladder programming or launching scripts etc. There would also be portability issues with modules that could work with one type of controller but not with another. In essence this would become the central daemon idea. The obvious benefit is that there could exist multiple controllers in one running system but really there is nothing preventing that in any of the other architectures either. The difference is how the data gets passed from module to module and how messages are handled.

Another drawback to this approach is the assumption that this thing will “control” something. It may exist on a particular computer simply to act as an HMI to the controller that is on another machine.

The final architecture that I can see is one where there is a single running process with dynamically loadable libraries to act as modules. This is basically how PAM (Pluggable Authentication Modules) works. This has some major drawbacks. The biggest is the ease of developing new modules. One of the requirements that I listed above is the easy creation of modules. A dynamically loadable shared library doesn’t fit that mold. It also precludes being able to use interpreted scripting languages as modules. It also may not be very portable. All modern desktop operating systems have dynamically loadable libraries but there are some issues with this on smaller embedded systems. There may be some licensing issues involved here too. Did I mention that a wayward module could bring down the whole system? I’d rather have the modules as separate processes and let Linux do what Linux does best, handle processes.
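The isolation argument is easy to demonstrate. In this small sketch (the `run_module` helper is made up for the example) each "module" runs in its own process. One of them aborts, and the parent simply sees a non-zero exit status and carries on.

```python
import subprocess
import sys


def run_module(source):
    """Run module code in its own interpreter process; return its exit status."""
    return subprocess.run([sys.executable, "-c", source]).returncode


# A wayward module that aborts. On Linux the status is negative
# (the number of the signal that killed the child process).
crashed = run_module("import os; os.abort()")

# A well-behaved module exits normally with status 0.
healthy = run_module("pass")

print("crashed module status:", crashed)
print("healthy module status:", healthy)
print("parent is still running")
```

Had the aborting code been loaded into the parent as a shared library instead, the last line would never have printed.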

Now about licensing. I spent some time last night reading about the GPL. I had about decided that I was going to release the main core process program as GPL software and the library as LGPL. This may still be the way to go but I need to think hard about the LGPL. There are some commercial implications for this software. To truly turn into something useful it would need to have some commercial support. This means that companies and consultants need to be able to write proprietary modules and sell them to their customers. Another benefit of the LGPL is one of I/O for commercial hardware. Allen Bradley could write a binary module to talk to some of its controllers. I would like to be able to limit the user’s ability to write logic modules for customers and keep them proprietary, but I’d like for big automation companies to be comfortable writing I/O modules while still protecting their intellectual property (sorry FSF, didn’t mean to say bad words).

If an end user developed a module to do PID loop calculations and then decided to box it up and sell it without also distributing the source, we’d all lose. But if Allen Bradley would have otherwise written an I/O driver for the PCICS card except for the GPL then that would be equally bad. I’m torn. Perhaps we need to have two libraries: one that can be used simply as an I/O module library (no access to another module’s data), released under the LGPL, and a fully functional library released under the GPL. Then any actual improvements to reusable logic type modules would be free and large hardware vendors could still write I/O modules for their hardware. It’d be hard to succeed in this industry without being able to communicate with existing hardware. How ’bout a Linux box controlling a ControlLogix rack full of I/O? Pretty cool idea, huh?
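The split between the two libraries might look something like the sketch below. The class names and the `(owner, value)` table layout are assumptions for illustration only. Both libraries expose the same calls; the restricted one refuses to touch a tag that belongs to another module.

```python
class IOLibrary:
    """Hypothetical LGPL-side API: a module can only touch tags it owns."""

    def __init__(self, table, module):
        self._table = table    # tag name -> (owner, value)
        self._module = module

    def _check(self, tag):
        if self._table[tag][0] != self._module:
            raise PermissionError(f"{self._module!r} does not own {tag!r}")

    def read(self, tag):
        self._check(tag)
        return self._table[tag][1]

    def write(self, tag, value):
        self._check(tag)
        owner, _ = self._table[tag]
        self._table[tag] = (owner, value)  # ownership never changes on write


class FullLibrary(IOLibrary):
    """Hypothetical GPL-side API: same calls, no ownership restriction."""

    def _check(self, tag):
        pass
```

In a real system the check could not live in the library alone, since a proprietary module could bypass it. The daemon would have to enforce the restriction on its side of the interface.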

There are issues also associated with the end user of the software. Say for instance you are hired as a consultant to build a system for a client. In the process of this work you build a module specifically for that client, and that client has you sign an NDA to protect an idea on the way they operate that particular process. The GPL still seems to work in this environment. The GPL says that you have to distribute the source code if you distribute the binary. The consultant would not be able to give his custom module to the client without the client also getting the source code. The client could choose not to distribute the module beyond that and therefore would have no obligation to distribute the source code. You the consultant would be bound by the NDA not to distribute the code to another client, so the client is protected. The one that is not allowed to protect his code is the consultant.

It’s important to note that this only applies if the consultant writes a module that links against the library. If he simply configures the system in a certain way and then writes some ladder logic code that is then interpreted by a ladder logic module, he is not required by the GPL to release the source of his ladder program to the client. This is one of the reasons that I would like to see the logic modules implemented as interpreters and not as module compilers. It gives an otherwise wary consultant a way to keep his code proprietary and still maintain the integrity of the GPL. Interpreters like Ruby, Perl, Python and Java might fall in the gray area. The bindings would be in the language and as such the bindings would be open source, but the programs themselves would probably not fall under the GPL. This is okay with the spirit of the project because it’s the bindings that we’d want to see as a part of the community so that others could use interpreters as modules, but the community doesn’t really care about how to control a bar-fooing process.

I think I’ll probably start down the path of using a central daemon process with two libraries: one LGPL library limited to accessing its own data map, and another full-featured library released under the GPL. This seems like the simplest solution and simple generally means better. I may find some reasons to change but the only other option is the library-only approach and it just seems far too complex to me. I’ll start dreaming up data structures and interfaces and see how it all starts to fit together. Please feel free to give me some comments on this.