2011-08-22

Writing a Linux device driver module for kernels 2.6 or later with udev

This is a short tutorial for a sample character device module aimed at Linux kernels 2.6 and later (including 3.x and 4.x) using udev.
I thought I might as well produce my own tutorial, after finding that most of the ones floating around the net are either lacking in features or not as up to date as they could be with regard to the latest kernel advancements.

Some of the features demonstrated with this sample are:
  • creation of a character device as a module
  • use of udev, for the device instantiation
  • use of a kernel FIFO, for data queuing
  • use of a mutex, to prevent concurrent access to the same device
  • use of sysfs, as a means to add data to the queue
  • detection of access mode on open, to ensure read-only access
  • provision of module options during insmod, to enable debug output
Not bad for a short sample, eh?

Of course, this example is just there to help get you started, and does not replace a proper guide to Linux driver development, such as the freely available and strongly recommended "Linux Device Drivers - Third Edition" from O'Reilly.


What's the deal with udev?

If you have some basic knowledge of device drivers on UNIX systems, you are probably already aware that devices are instantiated in the /dev directory through the use of a (major, minor) pair of integers. UNIX veterans will no doubt recall how they used to call on mknod to create their devices, and some UNIX systems still require that. Well into the 2.6 kernels, Linux was also more or less using the statically allocated major/minor approach, through its devfs implementation. At one stage however, kernel developers got slightly annoyed about having to coordinate the allocation of major/minor numbers and not being able to keep /dev as clean and as abstracted as they liked, and they also feared that Linux would run out of numbers to assign. Because of that, they developed a new approach to device instantiation called udev, which led them to eventually ditch devfs altogether in kernel 2.6.18. For more information about udev vs devfs, you may want to read Greg Kroah-Hartman's The final word from December 2003.

How this matters to you as a driver writer is that a lot of existing tutorials you will find rely on devfs or static allocation of a major and minor number, which doesn't apply to recent kernels. Thus, you want a tutorial that demonstrates how to use udev, such as the one provided here.
For the record, this sample driver was tested with kernel 3.0.3 as well as 2.6.39.


Device description

The origin of this device driver sample mostly stems from wanting to craft a virtual device to simulate an HDMI-CEC enabled TV, in order to facilitate the development of libcec/cecd. Basically, with such a driver, a libcec application can connect to a virtual device that plays a pre-recorded set of data, which avoids the need to have an actual HDMI-CEC device plugged in or even a development platform that has an HDMI port.

Since the driver above is fairly straightforward to write, it is a good candidate to serve as a base for a sample. In order to turn it into something that is generic enough for a tutorial, we removed anything that was HDMI-CEC related and turned the driver into one that can store sequences of character data and then repeat them in the order they were received, pretty much as a FIFO for text strings. Given this feature description, it makes a lot of sense to call our little device 'parrot'.

For this 'parrot' device then, what we need first is a buffer to store data. The smart approach here, rather than reinventing the wheel, is to use a Kernel FIFO, as this is a feature provided by the kernel and specifically aimed at device drivers. This will greatly simplify our design and of course, if you use this sample as a base for a driver that does actual hardware I/O, knowing how to use a Kernel FIFO will no doubt come in handy.

Now, while we obviously are going to use read accesses to the actual device (cat /dev/parrot) to read FIFO messages, we are not going to use write accesses to the device to fill the FIFO. Instead we will use sysfs, and make our device read-only.
This may sound counter-intuitive, aside from wanting to demonstrate the use of sysfs, but the reason is that, in a typical virtual repeater device scenario, you would use separate FIFOs for read and write, or a FIFO for read and a sink for write, since you don't want data written by your application to the device to interfere with the pre-defined scenario you have for readout. Of course, if you want to add a device write endpoint to our sample device, it is a simple matter of copy/pasting the read procedures and modifying them for write access.

Finally, because this is a sample, we will add some debug output which we will toggle using a module parameter on insmod as it is a good example to demonstrate how driver load options can be handled.


Source Walkthrough

The open()/close() calls should be fairly straightforward. The open() call demonstrates how one can use the struct file f_flags attribute to find out whether the caller is trying to use the device for read-only, write-only or read/write access. As indicated, we forcibly prevent write access, and return an error if the device is opened write-only or read/write. Then we use the non-blocking mutex_trylock() in open() and mutex_unlock() in close(), along with a global device mutex, to prevent more than one process from accessing the device at any one time. This way, we won't have to worry about concurrent read accesses to our Kernel FIFO.
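To make the access mode test concrete, here is a minimal userspace sketch of the same check. The helper name check_open_flags() is made up for this illustration; the driver performs the equivalent test on filp->f_flags inside parrot_device_open().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* The access modes are small integer values rather than independent bits,
 * which is why the flags are masked with O_ACCMODE and then compared. */
static_assert(O_RDONLY != O_WRONLY && O_RDONLY != O_RDWR,
              "access modes must be distinct values");

/* Hypothetical helper: returns 0 for read-only opens, -EACCES otherwise. */
static int check_open_flags(int flags)
{
    return ((flags & O_ACCMODE) == O_RDONLY) ? 0 : -EACCES;
}
```

Extra flags such as O_NONBLOCK do not affect the outcome, since only the bits selected by O_ACCMODE are compared.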

The read call, parrot_device_read(), uses kfifo_to_user() to copy data from the Kernel FIFO into userspace. One thing you must be mindful of, when writing device drivers, is that kernelspace and userspace are segregated, so you can't just pass a kernel buffer pointer to a user application and expect it to work. Data must be duplicated to or from userspace. Hence the _to_user call. Now, because we store multiple messages in a flat FIFO, and don't use specific message terminators such as NUL, we need a table to tell us the length of each message. This is what the parrot_msg_len[] (circular) table provides. For this to work, we also need 2 indexes into it: one to point to the length of the next message to be read (parrot_msg_idx_rd), and one to point to the next available length entry for a write operation (parrot_msg_idx_wr). We could also have used _enqueue and _dequeue for the suffixes, as this is what you will generally find in a driver source.
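The pairing of a flat byte FIFO with a circular length table can be modelled in plain userspace C. The sketch below is only an illustration: struct msg_fifo, msg_put() and msg_get() are invented names, and a simple byte ring stands in for the kernel's kfifo.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_SIZE 1024  /* bytes of message data, like PARROT_MSG_FIFO_SIZE */
#define MSG_MAX   128   /* length table slots, like PARROT_MSG_FIFO_MAX */

struct msg_fifo {
    char data[FIFO_SIZE];  /* flat byte ring: messages stored back to back */
    size_t head, used;     /* read position and number of bytes stored */
    size_t len[MSG_MAX];   /* circular table of message lengths */
    int idx_rd, idx_wr;    /* next length to read / next free length slot */
};

/* Append one message; returns 0, or -1 if either structure is full. */
static int msg_put(struct msg_fifo *f, const char *buf, size_t count)
{
    assert(f != NULL && buf != NULL);
    if (count > FIFO_SIZE - f->used)
        return -1;  /* not enough byte space left */
    if ((f->idx_wr + 1) % MSG_MAX == f->idx_rd)
        return -1;  /* length table is full */
    for (size_t i = 0; i < count; i++)
        f->data[(f->head + f->used + i) % FIFO_SIZE] = buf[i];
    f->used += count;
    f->len[f->idx_wr] = count;
    f->idx_wr = (f->idx_wr + 1) % MSG_MAX;
    return 0;
}

/* Remove the oldest message into out; returns its length, 0 if there
 * is no message, or -1 if out is too small to hold it. */
static long msg_get(struct msg_fifo *f, char *out, size_t out_size)
{
    assert(f != NULL && out != NULL);
    if (f->idx_rd == f->idx_wr)
        return 0;
    size_t n = f->len[f->idx_rd];
    if (n > out_size)
        return -1;
    for (size_t i = 0; i < n; i++)
        out[i] = f->data[(f->head + i) % FIFO_SIZE];
    f->head = (f->head + n) % FIFO_SIZE;
    f->used -= n;
    f->idx_rd = (f->idx_rd + 1) % MSG_MAX;
    return (long)n;
}
```

Because an equal read and write index means "empty", one slot of the length table is always left unused, so at most MSG_MAX - 1 messages can be queued at once.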

The one_shot and message_read variables are used to alleviate a problem you will face if you use cat to read from the device. The problem with cat is that it will continue to issue read requests until the device indicates that there is no more data to be read (with a read size of zero). Thus, if left unchecked, cat will deplete the whole FIFO, rather than return a single message. To avoid that, and because we know that cat will issue an open() call before attempting to read the device, we use the boolean message_read to tell us whether this is our first read request since open or not. If it isn't the first request, and the one_shot module parameter is enabled (which is the default), we just return zero for all subsequent read attempts. This way, each time you issue a cat, you will read at most one message.
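The interaction between cat and the one_shot flag can be modelled in a few lines of userspace C. This is a sketch only: struct reader and its functions are invented for the illustration, and the pending counter stands in for the FIFO content.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct reader {
    bool one_shot;      /* mirrors the one_shot module parameter */
    bool message_read;  /* set once a message was returned since open */
    size_t pending;     /* messages still queued (stand-in for the FIFO) */
};

/* Like the driver's open(): each open starts with a clean flag. */
static void reader_open(struct reader *r)
{
    assert(r != NULL);
    r->message_read = false;
}

/* Returns 1 if a message was delivered, 0 to signal end-of-file. */
static int reader_read(struct reader *r)
{
    assert(r != NULL);
    if (r->one_shot && r->message_read)
        return 0;  /* already served one message since open */
    if (r->pending == 0)
        return 0;  /* nothing queued */
    r->pending--;
    r->message_read = true;
    return 1;
}

/* Mimic cat: open, then read until a read returns 0.
 * Returns the number of messages obtained. */
static int cat_once(struct reader *r)
{
    int n = 0;
    reader_open(r);
    while (reader_read(r) > 0)
        n++;
    return n;
}
```

With one_shot set, every call to cat_once() yields a single message; with it cleared, the first call drains everything that is queued.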

With our open(), close() and read() calls properly defined, we can set the file operations structure up, which we do in the static struct file_operations fops. For more on the file operations which you can use for character devices, please see Chapter 3 of the Linux Device Drivers guide.

All the above takes care of our basic driver workhorse. Yet we still haven't provided any means for the user to populate the FIFO. This is done through the next subroutine called sys_add_to_fifo().

The first thing we do there is check for overflows on either the FIFO or the message length table. Then, we use kfifo_in() to copy into the FIFO the data that the user wrote to the sysfs endpoint (see below). Note that, in this case, the sysfs mechanisms have already copied the data provided by the user into kernelspace, therefore using kfifo_in() just works, and there is no need to call kfifo_from_user(), as we would have had to do if we were writing a regular driver write procedure.

The other sysfs endpoint we provide is a FIFO reset facility, implemented in sys_reset, which is very straightforward.

With our sysfs functions defined, we can use the DEVICE_ATTR() macros to define the structures which we'll have to pass to the system during the module initialization. The S_IWUSR parameter indicates that we want the owner of these endpoints (root) to have write access to them, and that nobody gets read access.

Now, we come to the last part of our driver implementation, with the actual module initialization and destruction. In module_init() the first thing we do, and this is the part where we rely on udev, is to dynamically allocate a major, through register_chrdev(). This call takes 3 parameters: The first one is either a specific major number, or 0 if you want the system to allocate one for you (our case), the second is a string identifier (which can be different from the name you will use in /dev) and the last parameter is a pointer to a file operations structure.

Next, we have a choice of creating a device associated with a class or a bus. For a simple sample such as ours, and given that our device is not tied to any hardware or protocol, creating a bus would be overkill, so we will go with a class. For laymen, the visible difference lies in where your devices get listed: a class device with no parent, such as ours, appears under /sys/devices/virtual/<classname> in sysfs, whereas a bus device gets its own location elsewhere under /sys/devices. With a bus, you would also have your devices under <classname> in /dev (though it is possible to achieve the same with a class if you add a "/" in your device name when calling device_create()).

With a class established, we can then call device_create() which performs the actual instantiation of our device. The description of the device_create() parameters can be found here. We use the MKDEV() macro to create a dev_t out of the major we previously obtained, using 0 as the minor as we only have one device. It is at this stage that a "parrot_device" gets created in /dev, using the parameter provided for the name.
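For the curious, here is a userspace sketch of what MKDEV() and its companions compute. It mirrors the 12-bit major / 20-bit minor packing defined in the kernel's include/linux/kdev_t.h, but the lowercase function names are mine, and the value 251 is simply the major that appears in the test session listing.

```c
#include <assert.h>
#include <stdint.h>

#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

/* Pack a major and a minor into a 32-bit device number. */
static uint32_t mkdev(uint32_t major, uint32_t minor)
{
    assert(major < (1U << 12) && minor <= MINORMASK);
    return (major << MINORBITS) | minor;
}

/* Extract the major (upper 12 bits) from a device number. */
static uint32_t dev_major(uint32_t dev)
{
    return dev >> MINORBITS;
}

/* Extract the minor (lower 20 bits) from a device number. */
static uint32_t dev_minor(uint32_t dev)
{
    return dev & MINORMASK;
}
```

With a major of 251 and a minor of 0, mkdev() returns 0x0FB00000, and dev_major() and dev_minor() recover the original pair.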

The next calls to device_create_file() are used to create the two sysfs endpoints, fifo and reset, out of the structures previously defined with DEVICE_ATTR(). Because we use a class, this results in the /sys/devices/virtual/parrot/parrot_device/fifo and /sys/devices/virtual/parrot/parrot_device/reset endpoints being created.

Finally we end the module_init() call with the remainder of the device initialization (FIFO, mutex, indexes).

The remainder of the code should be fairly explicit so I won't comment on it.


Source:

parrot_driver.h:
/*
 * Linux 2.6 and later 'parrot' sample device driver
 *
 * Copyright (c) 2011-2015, Pete Batard <pete@akeo.ie>
 *
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 */

#define DEVICE_NAME "device"
#define CLASS_NAME "parrot"
#define PARROT_MSG_FIFO_SIZE 1024
#define PARROT_MSG_FIFO_MAX  128

#define AUTHOR "Pete Batard <pete@akeo.ie>"
#define DESCRIPTION "'parrot' sample device driver"
#define VERSION "1.2"

/* We'll use our own macros for printk */
#define dbg(format, arg...) do { if (debug) pr_info(CLASS_NAME ": %s: " format, __FUNCTION__, ## arg); } while (0)
#define err(format, arg...) pr_err(CLASS_NAME ": " format, ## arg)
#define info(format, arg...) pr_info(CLASS_NAME ": " format, ## arg)
#define warn(format, arg...) pr_warn(CLASS_NAME ": " format, ## arg)

parrot_driver.c:
/*
 * Linux 2.6 and later 'parrot' sample device driver
 *
 * Copyright (c) 2011-2015, Pete Batard <pete@akeo.ie>
 *
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 */

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/types.h>
#include <linux/mutex.h>
#include <linux/kfifo.h>
#include "parrot_driver.h"

/* Module information */
MODULE_AUTHOR(AUTHOR);
MODULE_DESCRIPTION(DESCRIPTION);
MODULE_VERSION(VERSION);
MODULE_LICENSE("GPL");

/* Device variables */
static struct class* parrot_class = NULL;
static struct device* parrot_device = NULL;
static int parrot_major;
/* Flag used with the one_shot mode */
static bool message_read;
/* A mutex will ensure that only one process accesses our device */
static DEFINE_MUTEX(parrot_device_mutex);
/* Use a Kernel FIFO for read operations */
static DECLARE_KFIFO(parrot_msg_fifo, char, PARROT_MSG_FIFO_SIZE);
/* This table keeps track of each message length in the FIFO */
static unsigned int parrot_msg_len[PARROT_MSG_FIFO_MAX];
/* Read and write index for the table above */
static int parrot_msg_idx_rd, parrot_msg_idx_wr;

/* Module parameters that can be provided on insmod */
static bool debug = false; /* print extra debug info */
module_param(debug, bool, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(debug, "enable debug info (default: false)");
static bool one_shot = true; /* only read a single message after open() */
module_param(one_shot, bool, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(one_shot, "disable the readout of multiple messages at once (default: true)");


static int parrot_device_open(struct inode* inode, struct file* filp)
{
 dbg("");

 /* Our sample device does not allow write access */
 if ( ((filp->f_flags & O_ACCMODE) == O_WRONLY)
   || ((filp->f_flags & O_ACCMODE) == O_RDWR) ) {
  warn("write access is prohibited\n");
  return -EACCES;
 }

 /* Ensure that only one process has access to our device at any one time
  * For more info on concurrent accesses, see http://lwn.net/images/pdf/LDD3/ch05.pdf */
 if (!mutex_trylock(&parrot_device_mutex)) {
  warn("another process is accessing the device\n");
  return -EBUSY;
 }

 message_read = false;
 return 0;
}

static int parrot_device_close(struct inode* inode, struct file* filp)
{
 dbg("");
 mutex_unlock(&parrot_device_mutex);
 return 0;
}

static ssize_t parrot_device_read(struct file* filp, char __user *buffer, size_t length, loff_t* offset)
{
 int retval;
 unsigned int copied;

 /* The default from 'cat' is to issue multiple reads until the FIFO is depleted
  * one_shot avoids that */
 if (one_shot && message_read) return 0;
 dbg("");

 if (kfifo_is_empty(&parrot_msg_fifo)) {
  dbg("no message in fifo\n");
  return 0;
 }

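 /* Refuse a read whose buffer cannot hold the whole next message, as
  * kfifo_to_user() would otherwise copy past the end of the user buffer */
 if (length < parrot_msg_len[parrot_msg_idx_rd]) {
  warn("read buffer too small for next message\n");
  return -EINVAL;
 }
 retval = kfifo_to_user(&parrot_msg_fifo, buffer, parrot_msg_len[parrot_msg_idx_rd], &copied);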
 /* Ignore short reads (but warn about them) */
 if (parrot_msg_len[parrot_msg_idx_rd] != copied) {
  warn("short read detected\n");
 }
 /* loop into the message length table */
 parrot_msg_idx_rd = (parrot_msg_idx_rd+1)%PARROT_MSG_FIFO_MAX;
 message_read = true;

 return retval ? retval : copied;
}

/* The file_operations structure tells the kernel which device operations are handled.
 * For a list of available file operations, see http://lwn.net/images/pdf/LDD3/ch03.pdf */
static struct file_operations fops = {
 .read = parrot_device_read,
 .open = parrot_device_open,
 .release = parrot_device_close
};

/* Placing data into the read FIFO is done through sysfs */
static ssize_t sys_add_to_fifo(struct device* dev, struct device_attribute* attr, const char* buf, size_t count)
{
 unsigned int copied;

 dbg("");
 if (kfifo_avail(&parrot_msg_fifo) < count) {
  warn("not enough space left on fifo\n");
  return -ENOSPC;
 }
 if ((parrot_msg_idx_wr+1)%PARROT_MSG_FIFO_MAX == parrot_msg_idx_rd) {
  /* We've looped into our message length table */
  warn("message length table is full\n");
  return -ENOSPC;
 }

 /* The buffer is already in kernel space, so no need for ..._from_user() */
 copied = kfifo_in(&parrot_msg_fifo, buf, count);
 parrot_msg_len[parrot_msg_idx_wr] = copied;
 if (copied != count) {
  warn("short write detected\n");
 }
 parrot_msg_idx_wr = (parrot_msg_idx_wr+1)%PARROT_MSG_FIFO_MAX;

 return copied;
}

/* This sysfs entry resets the FIFO */
static ssize_t sys_reset(struct device* dev, struct device_attribute* attr, const char* buf, size_t count)
{
 dbg("");

 /* Ideally, we would have a mutex around the FIFO, to ensure that we don't reset while in use.
  * To keep this sample simple, and because this is a sysfs operation, we don't do that */
 kfifo_reset(&parrot_msg_fifo);
 parrot_msg_idx_rd = parrot_msg_idx_wr = 0;

 return count;
}

/* Declare the sysfs entries. The macros create instances of dev_attr_fifo and dev_attr_reset */
static DEVICE_ATTR(fifo, S_IWUSR, NULL, sys_add_to_fifo);
static DEVICE_ATTR(reset, S_IWUSR, NULL, sys_reset);

/* Module initialization and release */
static int __init parrot_module_init(void)
{
 int retval;
 dbg("");

 /* First, see if we can dynamically allocate a major for our device */
 parrot_major = register_chrdev(0, DEVICE_NAME, &fops);
 if (parrot_major < 0) {
  err("failed to register device: error %d\n", parrot_major);
  retval = parrot_major;
  goto failed_chrdevreg;
 }

 /* We can either tie our device to a bus (existing, or one that we create)
  * or use a "virtual" device class. For this example, we choose the latter */
 parrot_class = class_create(THIS_MODULE, CLASS_NAME);
 if (IS_ERR(parrot_class)) {
  err("failed to register device class '%s'\n", CLASS_NAME);
  retval = PTR_ERR(parrot_class);
  goto failed_classreg;
 }

 /* With a class, the easiest way to instantiate a device is to call device_create() */
 parrot_device = device_create(parrot_class, NULL, MKDEV(parrot_major, 0), NULL, CLASS_NAME "_" DEVICE_NAME);
 if (IS_ERR(parrot_device)) {
  err("failed to create device '%s_%s'\n", CLASS_NAME, DEVICE_NAME);
  retval = PTR_ERR(parrot_device);
  goto failed_devreg;
 }

 /* Now we can create the sysfs endpoints (don't care about errors).
  * dev_attr_fifo and dev_attr_reset come from the DEVICE_ATTR(...) earlier */
 retval = device_create_file(parrot_device, &dev_attr_fifo);
 if (retval < 0) {
  warn("failed to create write /sys endpoint - continuing without\n");
 }
 retval = device_create_file(parrot_device, &dev_attr_reset);
 if (retval < 0) {
  warn("failed to create reset /sys endpoint - continuing without\n");
 }

 mutex_init(&parrot_device_mutex);
 /* This device uses a Kernel FIFO for its read operation */
 INIT_KFIFO(parrot_msg_fifo);
 parrot_msg_idx_rd = parrot_msg_idx_wr = 0;

 return 0;

failed_devreg:
 class_destroy(parrot_class);
failed_classreg:
 unregister_chrdev(parrot_major, DEVICE_NAME);
failed_chrdevreg:
 return retval;
}

static void __exit parrot_module_exit(void)
{
 dbg("");
 device_remove_file(parrot_device, &dev_attr_fifo);
 device_remove_file(parrot_device, &dev_attr_reset);
 device_destroy(parrot_class, MKDEV(parrot_major, 0));
 class_destroy(parrot_class);
 unregister_chrdev(parrot_major, DEVICE_NAME);
}

/* Let the kernel know the calls for module init and exit */
module_init(parrot_module_init);
module_exit(parrot_module_exit);
The relevant Makefile is provided in the parrot source archive (link below).

Testing:
root@linux:/ext/src/parrot# make
make -C /usr/src/linux SUBDIRS=/mnt/hd/src/parrot modules
make[1]: Entering directory `/mnt/hd/src/linux-3.0.3'
  CC [M]  /mnt/hd/src/parrot/parrot_driver.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /mnt/hd/src/parrot/parrot_driver.mod.o
  LD [M]  /mnt/hd/src/parrot/parrot_driver.ko
make[1]: Leaving directory `/mnt/hd/src/linux-3.0.3'
root@linux:/ext/src/parrot# insmod parrot_driver.ko debug=1
root@linux:/ext/src/parrot# lsmod
Module                  Size  Used by
parrot_driver           4393  0
root@linux:/ext/src/parrot# ls -lF /dev/parrot_device
crw------- 1 root root 251, 0 2011-08-22 22:04 /dev/parrot_device
root@linux:/ext/src/parrot# ls -lF /sys/devices/virtual/parrot/parrot_device/
total 0
-r--r--r-- 1 root root 4096 2011-08-22 22:05 dev
--w------- 1 root root 4096 2011-08-22 22:05 fifo
drwxr-xr-x 2 root root    0 2011-08-22 22:05 power/
--w------- 1 root root 4096 2011-08-22 22:05 reset
lrwxrwxrwx 1 root root    0 2011-08-22 22:04 subsystem -> ../../../../class/parrot/
-rw-r--r-- 1 root root 4096 2011-08-22 22:04 uevent
root@linux:/ext/src/parrot# echo "this should generate an error" > /dev/parrot_device
-bash: /dev/parrot_device: Permission denied
root@linux:/ext/src/parrot# echo "Yabba Dabba Doo" > /sys/devices/virtual/parrot/parrot_device/fifo
root@linux:/ext/src/parrot# echo "Yabba Dabba Daa" > /sys/devices/virtual/parrot/parrot_device/fifo
root@linux:/ext/src/parrot# cat /dev/parrot_device
Yabba Dabba Doo
root@linux:/ext/src/parrot# cat /dev/parrot_device
Yabba Dabba Daa
root@linux:/ext/src/parrot# echo "test" > /sys/devices/virtual/parrot/parrot_device/fifo
root@linux:/ext/src/parrot# echo "test" > /sys/devices/virtual/parrot/parrot_device/fifo
root@linux:/ext/src/parrot# echo 1 > /sys/devices/virtual/parrot/parrot_device/reset
root@linux:/ext/src/parrot# cat /dev/parrot_device
root@linux:/ext/src/parrot#

Links:

2011-08-12

Which controllers support USB 3.0 Debug?

As defined per section 7.6 of the xHCI specifications?

Unfortunately, none of the current ones from Renesas, VIA Labs or Fresco Logic do, and none of these manufacturers is planning to add the capability to their existing controllers, through a firmware update for instance... Instead, they only plan to provide it on their next generation controllers, which won't be available for some time (yes, I am aware that the new Renesas chips have been announced since 2011.03, but I have yet to see a PCI-E expansion card with one of those -- if you actually managed to get your hands on one, I'd like to hear from you!
UPDATE 2011.08.29: Apparently those guys did... but as a world exclusive, since these cards won't be available till later on this year. Darn!).
UPDATE 2012.01.10: At long friggin' last, uPD720201 based PCIE cards, with debug support, are starting to become available, one such being the Buffalo IFC-PCIE4U3S. Only seems to be available in Asia and Australia at the moment however, but even then, orders may not be satisfied before February because of lack of availability...

So this means, if you have a NEC/Renesas uPD720200/uPD720200A, a VIA Labs VL800/801 or a Fresco Logic FL1000/FL1009 based USB 3.0 controller, you're out of luck with regards to USB 3.0 debug.
This doesn't bode too well for USB 3.0 as replacement for legacy RS232 ports...
Oh and the other disappointing part of the current crop of PCIE-USB3 controllers is they don't support USB-3.0 boot either. Quite the letdown...

Hint: If you want to display the Extended Capabilities of your controller, you can do so by modifying xhci_setup_port_arrays() in drivers/usb/host/xhci-mem.c on a recent Linux kernel, as it already has some code to scan the extended caps. Using this method, I was able to find that, as of firmware 4015, the only Extended Capabilities provided by a uPD720200 USB 3.0 controller were "USB Legacy Support" (xHCI specs chapter 7.1) and "xHCI Supported Protocol" (7.2). The same applies to a Fresco Logic FL1000 based controller that I just got my hands on, though this one also sports vendor specific Extended Capabilities, with IDs 192 and 193.
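As an illustration of what such a scan involves, here is a userspace sketch that walks the Extended Capabilities list in a snapshot of the controller's register space. It assumes the layout from the xHCI specification (the xECP field in bits 31:16 of HCCPARAMS1, and each capability dword carrying its ID in bits 7:0 and a relative next pointer in bits 15:8, both pointers counted in dwords). The function name and the snapshot approach are my own, not what the kernel code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XHCI_HCCPARAMS1_DWORD 4   /* HCCPARAMS1 sits at byte offset 0x10 */
#define XHCI_EXT_CAP_DEBUG    10  /* USB Debug Capability ID (xHCI 7.6) */

/* Hypothetical helper: collect up to max_ids capability IDs from a copy
 * of the register space (regs, n_dwords). Returns how many were found. */
static size_t xhci_list_ext_caps(const uint32_t *regs, size_t n_dwords,
                                 uint8_t *ids, size_t max_ids)
{
    assert(regs != NULL && ids != NULL);
    if (n_dwords <= XHCI_HCCPARAMS1_DWORD)
        return 0;

    /* xECP: offset of the first capability, in dwords from the base */
    size_t off = regs[XHCI_HCCPARAMS1_DWORD] >> 16;
    size_t count = 0;

    while (off != 0 && off < n_dwords && count < max_ids) {
        uint32_t cap = regs[off];
        uint32_t next = (cap >> 8) & 0xff;  /* relative, in dwords */

        ids[count++] = (uint8_t)(cap & 0xff);
        if (next == 0)
            break;                          /* end of the list */
        off += next;
    }
    return count;
}
```

A controller that supports USB 3.0 debug would report ID 10 somewhere in that list, whereas the controllers discussed above would only return IDs 1 and 2, plus any vendor specific ones.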

Below are the answers from various manufacturers regarding their planned support of USB 3.0 debug. Being mostly interested in addon cards for existing systems, I haven't asked Intel or AMD.

NEC/Renesas:
"The uPD720200 does not support the debug port capability, and there is no plan to add it. But we have already added debug port support to our newer USB3 host controllers, uPD720201 and uPD720202."
VIA Labs:
"VL800/801 doesn't support debug port capability. As I know, the only one support is Renesas 3rd gen. host controller UPD720201_202, we plan to add this feature with VL805 that will MP next year."
Fresco Logic:
"FL1000/FL1009 didn't support USB debug extended capability.
We are doing it (for the next generation of controllers) and the schedule will be 2012/Q1."

2011-08-09

Enabling option ROM and flashing gPXE on an SMC 1211

If you're going to play with PCI option ROMs, you're likely to salvage a PCI Network Interface Card (NIC) with a flash ROM, or at least one that has a socket for it. And given their popularity in the late 90s/early 2000s, you have a fair chance of getting your hands on a Realtek RTL8139 based one. One such NIC is the SMC 1211 (or "Accton Technology Corporation SMC2-1211TX" as reported by lspci -nn on Linux), which, in its basic configuration, comes with a DIP socket for 5V flash chips up to 128 KB in size. This is actually quite a desirable card to have as it is beautifully supported by flashrom, which has great support for the RTL8139 chip, and no extra tweaking is needed to support the maximum flash size of 128 KB, as can be the case with other NICs. Add a W29C011A-15 or compatible, which can be easily obtained or salvaged from an old motherboard, and you'll have more than enough space to play with an option ROM.

Only problem of course is that most SMC 1211s don't come with a flash chip by default, so the option ROM is disabled and needs to be re-enabled. Luckily, Realtek does provide a tool called rset8139.exe to do just that. Of course, rather than blindly trusting the tool before we go and play with a custom option ROM, we may as well check that everything is in order, so we are first going to flash a proper ROM, such as gPXE, before running the rset8139 tool. Off we go then to etherboot/gPXE's awesome rom-o-matic, where we fill in our options, including the 1113:1211 VID:PID of our SMC card, and get our gPXE ROM back.

First order of the day, since we're using a 128 KB flash chip and the gPXE ROM we got was smaller, is pad our ROM to our target size with:

cat gpxe-1.0.1-11131211.rom /dev/zero | dd bs=1k count=128 iflag=fullblock > gpxe_128k.rom
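If you'd rather not depend on dd's behaviour when reading from a pipe, the same padding can be sketched in a few lines of C. This is only an illustration: pad_rom() is a made-up helper, and unlike the command above it refuses an input that is larger than the target size instead of truncating it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy in_path to out_path, then append zero bytes
 * until the output is exactly target bytes long. Returns 0 on success,
 * -1 on I/O error or if the input is larger than target. */
static int pad_rom(const char *in_path, const char *out_path, long target)
{
    assert(in_path != NULL && out_path != NULL && target > 0);

    FILE *in = fopen(in_path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(out_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    long written = 0;
    int c, ok = 1;

    /* Copy the original image byte by byte */
    while (ok && (c = fgetc(in)) != EOF) {
        if (written == target || fputc(c, out) == EOF)
            ok = 0;
        else
            written++;
    }
    /* Pad with zeros up to the requested size */
    while (ok && written < target) {
        if (fputc(0, out) == EOF)
            ok = 0;
        else
            written++;
    }

    fclose(in);
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For the 128 KB chip used here, the call would be pad_rom("gpxe-1.0.1-11131211.rom", "gpxe_128k.rom", 128 * 1024L).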

Then we flash with:

flashrom -p nicrealtek -w gpxe_128k.rom

So that takes care of having a proper option ROM. Of course, since we haven't enabled it on the card, you will find that no matter the options you select in your BIOS, the SMC option ROM is not executed, and this is precisely why we need to run that rset8139 utility. Now, it is possible that Linux or Windows versions of this utility exist, but it looks like the most common version is the DOS one. Thanks to the oh-so-convenient Rufus (see this post), running it from a DOS bootable USB stick is no problem at all.

One important thing to note is that there exist multiple versions of the utility, ranging from 5.00 to 5.09, and that not all of these appear to detect the SMC 1211 card. Some regression has been introduced by Realtek, which leaves the most recent versions of rset8139 unable to change settings for the 1211. Therefore, the version I recommend using if you have an SMC card is v5.01, which can be picked up here. If you don't have an SMC card, then you can try your luck with v5.09, which appears to be the most recent, and which is available here.

Once you have created your DOS bootable USB stick and copied rset8139.exe over, you should end up with something similar to the screenshot below (courtesy of desconexão.net), where you will be able to enable and set your ROM size:


After these settings are saved and you reboot, you should find that the gPXE option ROM payload is now executed by your BIOS, and with that you can get cracking on building a custom option ROM using your RTL8139 based card.

2011-08-08

USB_DOS: The (once) easiest way to create a DOS bootable USB stick on Windows

(Updated 2011.12.14: this guide is now obsolete and has been superseded. If you want to create an MS-DOS or FreeDOS bootable USB Flash Drive, please visit the Rufus page).

If you ever need to run a DOS utility (e.g. to flash a BIOS or a firmware), and want to painlessly create a bootable DOS USB key from Windows, this is for you.

Courtesy of FreeDOS and the HP Bootable USB utility (HPUSBFW), the following archive contains everything you need to create a fully functional DOS bootable USB stick. Just extract the file below (using 7-zip if needed) and follow the instructions:

USB_DOS v1.0

Now, for some additional information, in case this may be of interest to you:
  • The HPUSBFW executable has been modified from the original to request elevation on Vista and later. If you remove the manifest you should find that it matches the official HP one.
  • The FreeDOS command.com and kernel.sys files come from the current version of FreeDOS (1.0). command.com has been patched to prevent it from requesting date and time input from the user at boot time. If you unpack command.com with upx, you should find that the only difference with the official one is a set of 4 NOPs (0x90).
    Oh, and yes I tried to recompile the latest FreeCOM from SVN, in a FreeDOS VMWare image, but it's been quite a struggle (hint: don't waste your time with OpenWatcom, use Borland Turbo C++ v2.01 instead) and the resulting command.com freezes after a few commands when used bare with kernel.sys, as is the case with HPUSBFW.

2011-08-03

UBRX - L2 cache as instruction RAM (CAiR)

Word of advice: if you want to play with CPU caches, and especially an L2 cache, don't use a Slot 1 Pentium III as your test system: L2 is disabled by default there and, because it's really some fast RAM that was added on the board that also has the CPU chip, initializing it takes a little more effort than simply flipping a switch... If you want to know all about how to initialize L2 cache on Slot 1 Pentiums, you may want to have a look at the old freebios code.

But now, with L2 sorted out and as a famous robot once said: "Things are starting to look up..."

Why is L2 cache so important for UBRX you ask? Good question.
You see, our goal is to run binary code that was uploaded by the user, on a system that is considered RAM-less, therefore all we have at our disposal are the caches. Now, CAR (Cache As RAM) has been in use by coreboot for some time, but the problem with this implementation is that it only uses L1 cache. However, if you know your CPU architecture 101, you are aware that there are two L1 caches on-die: one for data and another for instructions, with the instruction one being read-only. Thus, the CAR setup method from coreboot only provides access to the L1 data cache, not the instruction one, so we can't simply upload our code into L1-Data and expect it to run.

On the other hand, the L2 cache is a unified one, which means that it works for both data and instruction. Thus, if we manage to get our code onto L2, and have all the caches in WriteBack mode, we should be able to get the CPU to fetch instructions, which we uploaded, from L2, and we're good. This is called Cache As instruction RAM (CAiR). And with L2 caches being more than 256 KB in size, we could actually run a hefty and quite complex section of code, rather than be limited to the 16 or 32 KB of L1.

To achieve that, simply upload the code you want into L1-Data (which would have been initialized as CAR), then read or write a contiguous section of data, from a different address, that is larger than your L1 cache. As L1-Data gets replaced, your code gets pushed onto L2, where it is now accessible for execution by the CPU.
Neat!

So, how does it work in the UBRX console? Like this:
s/u/r/q> s
$60000010 a             # disable cache
$11e c $0 d $01043531 m # setup L2 for PIII Slot 1
$2ff c $0 d $c00 m      # fixed + var MTRRs
$268 c $06060606 d m    # C0000-C7FFF as WriteBack
! $10 a                 # flush and enable cache
$8000 c $c0000 <        # preload region to L2-Unified
# load our code in L1-Data
$c0000 d $f8ba68b0 z    # 'h'
$c0004 d $65b0ee03 z    # 'e'
$c0008 d $ee03f8ba z
$c000c d $f8ba6cb0 z    # 'l'
$c0010 d $6cb0ee03 z    # 'l'
$c0014 d $ee03f8ba z
$c0018 d $f8ba6fb0 z    # 'o'
$c001c d $0db0ee03 z    # CR
$c0020 d $ee03f8ba z
$c0024 d $f8ba0ab0 z    # LF
$c0028 d $ffcbee03 z
$8000 c $c0030 >        # flush L1-Data onto L2-Unified

$c0800 b $c0000         # stack at C8000, code at C0000
.
s/u/r/q> r
hello
s/u/r/q>
Now, by flashing less than 4 KB of your BIOS bootblock, you are able to run ANY code you want using the UBRX recovery console, even if you don't have any RAM installed, with the added benefit that your code can be as large as your L2 cache. Neat!

All of this and more in UBRX v0.4.