Adding a Character LCD to an FPGA Project
Adding a character liquid crystal display (LCD) to an FPGA project is a simple and inexpensive way to get your project talking. In this post I discuss interfacing an FPGA with a garden-variety generic 1602 LCD like the one pictured nearby. These alphanumeric displays include an integrated HD44780-compatible controller with a parallel interface that enables you to send data in ASCII format one byte at a time. The output is two lines of 16 characters each on a fixed dot matrix grid. They're cheap as chips (as my English friends say) and perfect for displaying measurements, results of calculations and any other simple messages.
My objective here was to develop a reusable Verilog module that could be dropped into multiple FPGA projects and handle the low-level communication with the LCD. As part of an effort to make the Verilog as platform independent as possible, I developed and tested the module on two different boards. First is an Olimex ICE40HX8K-EVB board using the Project Icestorm workflow. (I discuss the board and Project Icestorm in an earlier post here: iCE40 and the IceStorm Open Source FPGA Workflow.) Second is an Arrow BeMicro CV board featuring an Intel/Altera 5CEFA2F23C8N Cyclone V, developed using Quartus 15.0.
The Hitachi HD44780 LCD controller has been around for decades. I don't have an exact year of release, but a quick search has it turning up in catalogs and magazines by the early-to-mid-eighties. Somewhere along the line Hitachi stopped making them, but compatible replacements exist including the Sunplus SPLC780, the Samsung KS0066U and the Sitronix ST7066U. While not all small text displays are HD44780-compatible, the instruction set and interface for the controller have become something of a standard.
Both 5v and 3.3v displays are available. While it's technically possible to interface a 5v version, I don't have any appetite for letting 5v devices get too close to the 3.3v IO pins on my FPGA board, so I'm of course using a 3.3v version.
Configuring these displays is reasonably easy and there is a lot of raw material online to help get a project started. The fact that they've been around so long means there are tutorials available for every imaginable microcontroller. They've even shown up on some Xilinx and Altera training and development boards over the years.
Any trouble people have using 1602s mostly seems to stem from confusion over the controller's timing requirements - in particular the timing around executing its moderately cumbersome initialization sequence. These timing conundra are compounded by some pretty murky writeups in the datasheets along with minor variations in the specifications.
I think I've got the timing pretty well demystified and I'll provide some detail later in this post. But at a high level I can say that it's very helpful to remember that the timing requirements from the datasheets are minimum requirements. You don't have to try to achieve these exact time values. In fact, unless you have an application that needs to squeeze every nanosecond of performance out of the display, you can save yourself a lot of headache by building in some nice buffer around the minimum timings.
But first the basics.
The interface to a 1602 controller consists of 16 pins as shown in the graphic below.
Five of the pins are dedicated to power, including a contrast control and the backlight supply. Vss (ground) and Vdd (positive supply) are the main power for the logic. A and K are positive and ground for the backlight. The V0 pin adjusts the contrast of the display. Most setups use a 10k potentiometer on the contrast pin. You can tie it straight to ground through a 1K to 4K resistor, but the contrast is pretty sensitive and if you use a potentiometer you'll probably be happier.
Three of the pins are dedicated to control signals. The RS (register select) is used to indicate whether data on the bus is a character (RS=1) or an instruction (RS=0). The E (enable) pin is toggled on and off to enter the data. The R/W (read/write) pin switches between read (R/W=1) and write (R/W=0). R/W is frequently tied to ground and the LCD is treated as a write only device. (My setup has R/W tied to ground and the FPGA and Verilog ignore it.)
The remaining eight pins are the data bus. As noted, data on the bus can either be a character or an instruction. Characters include standard ASCII codes for the numbers 0-9, letters a-z and A-Z, and many of the general standard ASCII punctuation and symbols such as !, ?, #, and $. Beyond those characters, the LCD will have additional characters depending on the manufacturer of the board and the version of the controller. The controller can also manage user-defined custom characters.
The instruction set is fairly rich. Some key instructions for getting going include:
Instruction | HEX | Binary |
Clear display screen | 01 | 00000001 |
Return cursor home | 02 | 00000010 |
Shift cursor left | 10 | 00010000 |
Shift cursor right | 14 | 00010100 |
Shift display left | 18 | 00011000 |
Shift display right | 1C | 00011100 |
Move cursor to beginning of 1st line | 80 | 10000000 |
Move cursor to beginning of 2nd line | C0 | 11000000 |
Function Set: 2 lines, 5x7 matrix, 8-bit mode | 38 | 00111000 |
Entry Mode: Cursor moves right on new char | 06 | 00000110 |
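The codes in the table map naturally onto named constants. The sketch below is one possible way to capture them in Verilog. The module name lcd_commands_demo, the constant names and the set_cursor helper are illustrative assumptions and are not part of the driver presented later. The helper relies on the second display line starting at controller address 0x40, which is why the "beginning of 2nd line" code in the table is C0 (80 + 40).

```verilog
// Illustrative only: named constants for the instructions in the table above.
// Module, constant and function names are assumptions made for this sketch.
module lcd_commands_demo;
    localparam [7:0] CMD_CLEAR         = 8'h01; // clear display screen
    localparam [7:0] CMD_HOME          = 8'h02; // return cursor home
    localparam [7:0] CMD_CURSOR_LEFT   = 8'h10; // shift cursor left
    localparam [7:0] CMD_CURSOR_RIGHT  = 8'h14; // shift cursor right
    localparam [7:0] CMD_DISPLAY_LEFT  = 8'h18; // shift display left
    localparam [7:0] CMD_DISPLAY_RIGHT = 8'h1C; // shift display right
    localparam [7:0] CMD_LINE1         = 8'h80; // cursor to start of 1st line
    localparam [7:0] CMD_LINE2         = 8'hC0; // cursor to start of 2nd line
    localparam [7:0] CMD_FUNCTION_SET  = 8'h38; // 2 lines, 5x7 matrix, 8-bit mode
    localparam [7:0] CMD_ENTRY_MODE    = 8'h06; // cursor moves right on new char

    // Build a "move cursor" instruction for a given line and column.
    function [7:0] set_cursor;
        input       line; // 0 = first line, 1 = second line
        input [3:0] col;  // column 0 to 15
        begin
            set_cursor = CMD_LINE1 | (line ? 8'h40 : 8'h00) | {4'b0000, col};
        end
    endfunction

    // Simulation-only check: line 2, column 5 evaluates to 8'hC5.
    initial begin
        $display("line 2, column 5 -> %h", set_cursor(1'b1, 4'd5));
    end
endmodule
```

With column 0 the helper reproduces the two table entries, 80 for the first line and C0 for the second.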
The controller has an 8-bit mode where characters and commands are loaded one byte at a time, and a 4-bit mode where the 8-bit data is loaded in two 4-bit nibbles. The 4-bit mode saves output pins on your device at the cost of minor additional complexity sending the data. Here I'm using the full 8-bit mode.
At the highest level, controlling the display can be seen as a simple process of pushing a sequence of instructions and characters through by setting up the RS pin and the DB0-7 pins one byte at a time while toggling the E pin on and off to tell the controller as each byte is ready.
A more granular view of the process goes like this:
- Set the RS and DB0-7 pins, then delay at least 40ns (nanoseconds) to let them settle.
- Bring the E pin high and hold it for at least 230ns.
- Bring the E pin low, leaving RS and DB0-7 stable for at least 10ns more.
- Delay an additional minimum of 40µs (microseconds) for characters or 1.64ms (milliseconds) for instructions before entering the next byte.
A few notes on the above:
- If you study the datasheets you'll see that only RS needs to be stable before E goes high. It's therefore technically possible to bring DB0-7 up later, but it seems convenient and simplest to load the data in step one.
- The final delay is based on the time the controller needs to execute the character write or instruction. Not all instructions have the same duration. Instructions that clear the screen are the worst case at 1.64ms. For time-sensitive applications you can sub-divide the instructions category and push shorter duration instructions through more quickly.
- These timings are minimums and can vary a bit among manufacturers. I've had much better success adding some time to the first two delays.
Notional Verilog to write a character would look something like this:
module lcd(
clock_100,
rs,
e,
d
);
parameter STATE00 = 2'b00;
parameter STATE01 = 2'b01;
parameter STATE02 = 2'b10;
parameter DONE = 2'b11;
input clock_100;
output rs;
output e;
output [7:0] d;
reg rs = 1'b0;
reg e = 1'b0;
reg [7:0] d = 8'b00000000;
reg [1:0] state = 2'b00;
reg [23:0] count = 24'h000000;
always @(posedge clock_100) begin
case (state)
STATE00: begin
if (count == 24'h000000) begin // if this is the first cycle of STATE00
rs <= 1'b1; // pull RS high to indicate data
d <= 8'h48; // load the databus with ASCII "H"
count <= count + 24'h000001; // increment the counter
end
else if (count == 24'h000005) begin // if 50ns have passed
count <= 24'h000000; // clear the counter
state <= STATE01; // advance to the next state
end
else begin // if it's not the first or last
count <= count + 24'h000001; // increment the counter
end
end
STATE01: begin
if (count == 24'h000000) begin // if this is the first cycle of STATE01
e <= 1'b1; // bring E high to initiate data read
count <= count + 24'h000001; // increment the counter
end
else if (count == 24'h000019) begin // if 250ns have passed
count <= 24'h000000; // clear the counter
state <= STATE02; // advance to the next state
end
else begin // if it's not the first or last
count <= count + 24'h000001; // increment the counter
end
end
STATE02: begin
if (count == 24'h000000) begin // if this is the first cycle of STATE02
e <= 1'b0; // bring E low
count <= count + 24'h000001; // increment the counter
end
else if (count == 24'h000FA0) begin // if 40us have passed
count <= 24'h000000; // clear the counter
state <= DONE; // advance to the next state
end
else begin // if it's not the first or last
count <= count + 24'h000001; // increment the counter
end
end
DONE: ;
default: ;
endcase
end
endmodule
This simplified (but working and complete) Verilog module uses a state machine to step through the phases of loading a character, and a counter to pace the delays. One minor issue is calculating the number of clock cycles needed for the different required delays. The code sample above is calculated out for my ICE40 board which has a 100MHz clock. The formula is simple enough: clock frequency in Hertz * required delay in seconds. You just wind up with a lot of zeros on both sides of the decimal point - e.g. a 50ns delay at 100MHz = 100000000 * 0.00000005 = 5 clock cycles.
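To avoid counting zeros by hand, the same arithmetic can be left to the tools. The following is a minimal sketch and not part of the tested module: the module name lcd_delay_calc, the CLK_MHZ parameter and the two helper functions are assumptions made for illustration. It works in whole megahertz, nanoseconds and microseconds so that everything stays in integer arithmetic, and it rounds nanosecond delays up so that a computed delay is never shorter than requested.

```verilog
// Illustrative only: derive delay counts from the clock frequency.
// All names here are assumptions made for this sketch.
module lcd_delay_calc;
    localparam integer CLK_MHZ = 100; // clock frequency in MHz

    // cycles = MHz * ns / 1000, rounded up to the next whole cycle
    function integer ns_to_cycles;
        input integer ns;
        begin
            ns_to_cycles = (CLK_MHZ * ns + 999) / 1000;
        end
    endfunction

    // cycles = MHz * us (always a whole number)
    function integer us_to_cycles;
        input integer us;
        begin
            us_to_cycles = CLK_MHZ * us;
        end
    endfunction

    localparam integer D_50NS  = ns_to_cycles(50);  // 5 cycles at 100MHz
    localparam integer D_250NS = ns_to_cycles(250); // 25 cycles at 100MHz
    localparam integer D_40US  = us_to_cycles(40);  // 4000 cycles at 100MHz

    // Simulation-only printout of the computed values.
    initial begin
        $display("50ns  = %0d cycles", D_50NS);
        $display("250ns = %0d cycles", D_250NS);
        $display("40us  = %0d cycles", D_40US);
    end
endmodule
```

At 100MHz these come out to 5, 25 and 4000 cycles, matching the constants 24'h000005, 24'h000019 and 24'h000FA0 used in the notional module.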
With an understanding of the fundamental operation of the controller, two hurdles remain in putting together a full solution - addressing the HD44780's initialization routine and developing a concept of operations for passing data to the driver from other parts of the design.
Before data or instructions can be written to the controller, an initialization sequence must be executed that configures the controller and makes sure it's ready to receive input. The HD44780 does run through a version of the initialization when it first powers up, but there are a few problems with that routine. The hardware initialization is somewhat fragile and very dependent on the circuit's power supply. It also sets the controller to a default that might not be the desired state. Best practice looks to be including initialization in the design. For my own use, I have the FPGA drive the full initialization sequence on startup and have also tied the sequence to a push button on my board for a soft reset.
The sequence itself is simply a series of instructions that can be implemented using the same approach as was shown in the general case above. The only added complexity is that the delay between instructions varies, with some delays being considerably longer than others. For some reason the datasheets all seem to present this sequence as a flow chart, which only serves to over-complicate the process.
Step | Instruction (binary) | Delay |
1. Power up | No Instruction | 100ms |
2. Function Set | 8'b00110000 | 5ms |
3. Function Set | 8'b00110000 | 200us |
4. Function Set | 8'b00110000 | 200us |
5. Configure | 8'b00111000 | 60us |
6. Display Off | 8'b00001000 | 60us |
7. Clear | 8'b00000001 | 5ms |
8. Entry Mode Set | 8'b00000110 | 60us |
9. Display On | 8'b00001100 | 60us |
Some things to note for the initialization sequence:
- As with other parts of this process, I'm padding the delays here with a little more time than the datasheets present as minimum. Some of the extra time was added while I was troubleshooting and determined more time was needed for reliable performance - other places I just added time for good measure.
- Steps 2, 3 and 4 all load the same instruction to the controller. Once RS and D[0-7] are set on the FPGA it is not necessary to keep resetting them - all that is needed is to bring the enable pin high and low at the appropriate times.
- Step 5 configures the LCD for the particular application - in this case 8-bit mode, 2 lines, 5x7 font. There are other options for configuration.
- Step 8 configures the Entry Mode which determines what the controller does as it receives new characters. The setting here fills in text from left to right while the display itself stays still. There are multiple options.
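Because the sequence is just a list of instruction and delay pairs, it can also be written down as a small lookup table. The sketch below is only an illustration of that idea and is not how the full driver in this post is organized: the module name lcd_init_rom and its port names are assumptions, and a separate sequencer would still have to step through the entries and pulse E. The values are the padded ones from the table above, with delays expressed in microseconds. Step 1, the 100ms power-up wait, carries no instruction and is left to the sequencer.

```verilog
// Illustrative only: the initialization table as a combinational lookup.
// Module and port names are assumptions made for this sketch.
module lcd_init_rom (
    input  wire [3:0]  step,        // step number from the table (2 to 9)
    output reg  [7:0]  instruction, // byte for DB0-7, sent with RS = 0
    output reg  [12:0] delay_us     // padded delay after the byte, in microseconds
);
    always @(*) begin
        case (step)
            4'd2:    begin instruction = 8'b00110000; delay_us = 13'd5000; end // Function Set
            4'd3:    begin instruction = 8'b00110000; delay_us = 13'd200;  end // Function Set
            4'd4:    begin instruction = 8'b00110000; delay_us = 13'd200;  end // Function Set
            4'd5:    begin instruction = 8'b00111000; delay_us = 13'd60;   end // Configure
            4'd6:    begin instruction = 8'b00001000; delay_us = 13'd60;   end // Display Off
            4'd7:    begin instruction = 8'b00000001; delay_us = 13'd5000; end // Clear
            4'd8:    begin instruction = 8'b00000110; delay_us = 13'd60;   end // Entry Mode Set
            4'd9:    begin instruction = 8'b00001100; delay_us = 13'd60;   end // Display On
            default: begin instruction = 8'b00000000; delay_us = 13'd0;    end // not a table entry
        endcase
    end
endmodule
```

Keeping the table in one place like this makes it easy to compare against a datasheet, at the cost of needing extra logic to convert microseconds into clock cycles.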
To implement the initialization sequence, I again use a state machine to step through the process. I actually combine the initialization and operating phases into one 26-step state machine (code further below). Of course, my solution is only one of many possible approaches. In researching for this project and hunting up examples of FPGAs driving LCDs, I found tremendous variety in the ways that people set up their state machines and manage the timing. It's certainly worth looking around for inspiration.
In the full version of the code, I've moved the counting function outside of the state machine and set up a counter that sets flags at various key timing thresholds. You'll note also that I parameterized the timing. The parameterization makes it easier to track the timings through the code. More important, the parameterization makes it effortless to use the module at different clock speeds. It is only necessary to adjust the one value for clock speed and all subsequent delays adjust accordingly.
The final steps in making a usable LCD driver involve reconfiguring the module to process incoming data (rather than just displaying the letter "H" in the notional code) and establishing some type of handshaking.
Adding input to the basic notional module for data is trivial. I chose to treat it as one 9-bit chunk of data that includes both the RS and DB0-7 bits, but they could just as easily be conceptualized as two separate inputs.
Slightly trickier is the idea of establishing communication between the LCD driver module and the hypothetical other module producing the data. The LCD driver needs to know when there is fresh data available to be written to the display. Likewise, since the LCD driver needs considerable time to process a byte of data, the data producing module needs to know when the LCD driver is ready. In the code below you'll see that I addressed these two parallel needs with a "data_ready" input bit and a "busy_flag" output bit.
In states 23-25, which comprise the portion of the state machine that actually writes data to the LCD, state 23 will idle until the data_ready bit goes high. Once the bit goes high, the module sets its own internal tracking bit called "start" to high, sets the busy_flag output high, and starts processing the incoming data. The busy_flag and start bits remain high until the process is complete.
Another subtlety of the updated state machine is that state 25 now checks the ninth bit (RS) to distinguish between character and instruction input. When the RS indicates an instruction, the delay is longer.
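To make the handshake concrete, here is a minimal sketch of what the module on the other side might look like. It is not the controller from my downloadable demonstration: the module name lcd_feeder, its internal registers and the three-entry message are all assumptions made for illustration. It waits until busy_flag has been seen high once (the driver raises it during initialization), then offers one 9-bit word at a time whenever busy_flag is low, pulsing data_ready for a single clock and holding d_in steady until the driver reports busy.

```verilog
// Illustrative only: a tiny data producer for the data_ready / busy_flag handshake.
// Module name, registers and message contents are assumptions made for this sketch.
module lcd_feeder (
    input  wire       clock,
    input  wire       busy_flag,          // from the LCD driver
    output reg  [8:0] d_in = 9'd0,        // {RS, DB7..DB0} to the LCD driver
    output reg        data_ready = 1'b0   // one-clock strobe to the LCD driver
);
    localparam [1:0] LAST = 2'd2;         // index of the final message entry

    reg [1:0] index     = 2'd0;           // next entry to send
    reg       init_seen = 1'b0;           // busy_flag has been high at least once
    reg       waiting   = 1'b0;           // a byte was offered but not yet picked up
    reg       done      = 1'b0;           // whole message has been offered

    // Bit 8 is RS: 0 for an instruction, 1 for a character.
    function [8:0] message;
        input [1:0] i;
        begin
            case (i)
                2'd0:    message = {1'b0, 8'h01}; // instruction: clear display
                2'd1:    message = {1'b1, 8'h48}; // character "H"
                2'd2:    message = {1'b1, 8'h69}; // character "i"
                default: message = {1'b1, 8'h20}; // space (never sent)
            endcase
        end
    endfunction

    always @(posedge clock) begin
        data_ready <= 1'b0;                       // default: strobe low
        if (busy_flag) begin
            init_seen <= 1'b1;                    // driver is alive and working
            waiting   <= 1'b0;                    // offered byte has been accepted
        end
        else if (init_seen && !waiting && !done) begin
            d_in       <= message(index);         // present RS and data together
            data_ready <= 1'b1;                   // tell the driver fresh data is available
            waiting    <= 1'b1;
            if (index == LAST)
                done <= 1'b1;
            else
                index <= index + 2'd1;
        end
    end
endmodule
```

The waiting bit matters because the driver takes a couple of clocks to raise busy_flag after it sees data_ready. Without it, a producer that only looks at busy_flag could offer the same byte twice.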
Below is the Verilog code for the full LCD driver module. This module has been tested with a hello world demonstration. I'm not displaying the full demo, but you can download it here: Verilog LCD Hello World Demonstration. The demonstration adds a ROM with the ASCII codes for Hello World and a controller that interfaces with the LCD module to feed one character at a time.
module lcd(
clock,
internal_reset,
d_in,
data_ready,
rs,
e,
d,
busy_flag
);
parameter CLK_FREQ = 100000000;
parameter integer D_50ns = 0.000000050 * CLK_FREQ;
parameter integer D_250ns = 0.000000250 * CLK_FREQ;
parameter integer D_40us = 0.000040000 * CLK_FREQ;
parameter integer D_60us = 0.000060000 * CLK_FREQ;
parameter integer D_200us = 0.000200000 * CLK_FREQ;
parameter integer D_2ms = 0.002000000 * CLK_FREQ;
parameter integer D_5ms = 0.005000000 * CLK_FREQ;
parameter integer D_100ms = 0.100000000 * CLK_FREQ;
parameter STATE00 = 5'b00000;
parameter STATE01 = 5'b00001;
parameter STATE02 = 5'b00010;
parameter STATE03 = 5'b00011;
parameter STATE04 = 5'b00100;
parameter STATE05 = 5'b00101;
parameter STATE06 = 5'b00110;
parameter STATE07 = 5'b00111;
parameter STATE08 = 5'b01000;
parameter STATE09 = 5'b01001;
parameter STATE10 = 5'b01010;
parameter STATE11 = 5'b01011;
parameter STATE12 = 5'b01100;
parameter STATE13 = 5'b01101;
parameter STATE14 = 5'b01110;
parameter STATE15 = 5'b01111;
parameter STATE16 = 5'b10000;
parameter STATE17 = 5'b10001;
parameter STATE18 = 5'b10010;
parameter STATE19 = 5'b10011;
parameter STATE20 = 5'b10100;
parameter STATE21 = 5'b10101;
parameter STATE22 = 5'b10110;
parameter STATE23 = 5'b10111;
parameter STATE24 = 5'b11000;
parameter STATE25 = 5'b11001;
input clock;
input internal_reset;
input [8:0] d_in;
input data_ready;
output rs;
output e;
output [7:0] d;
output busy_flag;
reg rs = 1'b0;
reg e = 1'b0;
reg [7:0] d = 8'b00000000;
reg busy_flag = 1'b0;
reg [4:0] state = 5'b00000;
reg start = 1'b0;
reg [23:0] count = 24'h000000;
reg counter_clear = 1'b0;
reg flag_50ns = 1'b0;
reg flag_250ns = 1'b0;
reg flag_40us = 1'b0;
reg flag_60us = 1'b0;
reg flag_200us = 1'b0;
reg flag_2ms = 1'b0;
reg flag_5ms = 1'b0;
reg flag_100ms = 1'b0;
always @(posedge clock) begin
if (data_ready) begin
start <= 1'b1;
end
if (internal_reset) begin
state <= 5'b00000;
counter_clear <= 1'b1;
start <= 1'b0;
busy_flag <= 1'b0;
end
// counter
if (counter_clear) begin
count <= 24'h000000;
counter_clear <= 1'b0;
flag_50ns <= 1'b0;
flag_250ns <= 1'b0;
flag_40us <= 1'b0;
flag_60us <= 1'b0;
flag_200us <= 1'b0;
flag_2ms <= 1'b0;
flag_5ms <= 1'b0;
flag_100ms <= 1'b0;
end
if (!counter_clear) begin
count <= count + 1;
if (count == D_50ns) begin
flag_50ns <= 1'b1;
end
if (count == D_250ns) begin
flag_250ns <= 1'b1;
end
if (count == D_40us) begin
flag_40us <= 1'b1;
end
if (count == D_60us) begin
flag_60us <= 1'b1;
end
if (count == D_200us) begin
flag_200us <= 1'b1;
end
if (count == D_2ms) begin
flag_2ms <= 1'b1;
end
if (count == D_5ms) begin
flag_5ms <= 1'b1;
end
if (count == D_100ms) begin
flag_100ms <= 1'b1;
end
end
case (state)
// Step 1 - 100ms delay after power on
STATE00: begin
busy_flag <= 1'b1; // tells other modules LCD is processing
if (flag_100ms) begin // if 100ms have elapsed
rs <= 1'b0; // pull RS low to indicate instruction
d <= 8'b00110000; // set data to Function Set instruction
counter_clear <= 1'b1; // clear the counter
state <= STATE01; // advance to the next state
end
end
// Steps 2 thru 4 raise and lower the enable pin three times to enter the
// Function Set instruction that was loaded to the databus in STATE00 above
// Step 2 - first Function Set instruction
STATE01: begin
if (flag_50ns) begin // if 50ns have elapsed (lets RS and D settle)
e <= 1'b1; // bring E high to initiate data write
counter_clear <= 1'b1; // clear the counter
state <= STATE02; // advance to the next state
end
end
STATE02: begin
if (flag_250ns) begin // if 250ns have elapsed
e <= 1'b0; // bring E low
counter_clear <= 1'b1; // clear the counter
state <= STATE03; // advance to the next state
end
end
STATE03: begin
if (flag_5ms) begin // if 5ms have elapsed
e <= 1'b1; // bring E high to initiate data write
counter_clear <= 1'b1; // clear the counter
state <= STATE04; // advance to the next state
end
end
// Step 3 - second Function Set instruction
STATE04: begin
if (flag_250ns) begin // if 250ns have elapsed
e <= 1'b0; // bring E low
counter_clear <= 1'b1; // clear the counter
state <= STATE05; // advance to the next state
end
end
STATE05: begin
if (flag_200us) begin
e <= 1'b1;
counter_clear <= 1'b1;
state <= STATE06;
end
end
// Step 4 - third and final Function Set instruction
STATE06: begin
if (flag_250ns) begin
e <= 1'b0;
counter_clear <= 1'b1;
state <= STATE07;
end
end
STATE07: begin
if (flag_200us) begin
d <= 8'b00111000; // configuration cmd = 8-bit mode, 2 lines, 5x7 font
counter_clear <= 1'b1;
state <= STATE08;
end
end
// Step 5 - enter the Configuration command
STATE08: begin
if (flag_50ns) begin
e <= 1'b1;
counter_clear <= 1'b1;
state <= STATE09;
end
end
STATE09: begin
if (flag_250ns) begin
e <= 1'b0;
counter_clear <= 1'b1;
state <= STATE10;
end
end
STATE10: begin
if (flag_60us) begin
d <= 8'b00001000; // display off
counter_clear <= 1'b1;
state <= STATE11;
end
end
// Step 6 - enter the Display Off command
STATE11: begin
if (flag_50ns) begin
e <= 1'b1;
counter_clear <= 1'b1;
state <= STATE12;
end
end
STATE12: begin
if (flag_250ns) begin
e <= 1'b0;
counter_clear <= 1'b1;
state <= STATE13;
end
end
STATE13: begin
if (flag_60us) begin
d <= 8'b00000001; // clear command
counter_clear <= 1'b1;
state <= STATE14;
end
end
// Step 7 - enter the Clear command
STATE14: begin
if (flag_50ns) begin
e <= 1'b1;
counter_clear <= 1'b1;
state <= STATE15;
end
end
STATE15: begin
if (flag_250ns) begin
e <= 1'b0;
counter_clear <= 1'b1;
state <= STATE16;
end
end
STATE16: begin
if (flag_5ms) begin // 5ms (clear command has a long cycle time)
d <= 8'b00000110; // entry mode
counter_clear <= 1'b1;
state <= STATE17;
end
end
//Step 8 - Set the Entry Mode to: cursor moves, display stands still
STATE17: begin
if (flag_50ns) begin
e <= 1'b1;
counter_clear <= 1'b1;
state <= STATE18;
end
end
STATE18: begin
if (flag_250ns) begin
e <= 1'b0;
counter_clear <= 1'b1;
state <= STATE19;
end
end
STATE19: begin
if (flag_60us) begin
d <= 8'b00001100; // Display On
counter_clear <= 1'b1;
state <= STATE20;
end
end
// Step 9 - enter the Display On command
STATE20: begin
if (flag_50ns) begin
e <= 1'b1;
counter_clear <= 1'b1;
state <= STATE21;
end
end
STATE21: begin
if (flag_250ns) begin
e <= 1'b0;
counter_clear <= 1'b1;
state <= STATE22;
end
end
STATE22: begin
if (flag_60us) begin
busy_flag <= 1'b0; // clear the busy flag
counter_clear <= 1'b1;
state <= STATE23;
end
end
// End Initialization - Start entering data.
STATE23: begin
if (start && !busy_flag) begin // wait for data
start <= 1'b0; // clear the start flag
counter_clear <= 1'b1;
rs <= d_in[8]; // read the RS value from input
d <= d_in[7:0]; // read the data value input
busy_flag <= 1'b1; // set the busy flag
end
if (flag_50ns && busy_flag) begin // if 50ns have elapsed
counter_clear <= 1'b1; // clear the counter
state <= STATE24; // advance to the next state
end
end
STATE24: begin
if (counter_clear) begin // if this is the first iteration of STATE24
e <= 1'b1; // Bring E high to initiate data write
end
else if (flag_250ns) begin // if 250ns have elapsed
counter_clear <= 1'b1; // clear the counter
state <= STATE25; // advance to the next state
end
end
STATE25: begin
if (counter_clear) begin // if this is the first iteration of STATE25
e <= 1'b0; // Bring E low
end
else if (flag_40us == 1'b1 && rs == 1'b1) begin // if data is a character and 40us has elapsed
busy_flag <= 1'b0; // clear the busy flag
counter_clear <= 1'b1; // clear the counter
state <= STATE23; // go back to STATE23 and wait for next data
end
else if (flag_2ms == 1'b1 && rs == 1'b0) begin // if data is a command and 2ms has elapsed
busy_flag <= 1'b0; // clear the busy flag
counter_clear <= 1'b1; // clear the counter
state <= STATE23; // go back to STATE23 and wait for next data
end
end
default: ;
endcase
end
endmodule
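As a usage sketch, moving the module to a different clock speed only requires overriding CLK_FREQ when it is instantiated. The wrapper below assumes a hypothetical board with a 50MHz clock. The wrapper name lcd_top_50mhz and its port names are illustrative, and it only compiles alongside the lcd module above.

```verilog
// Illustrative only: instantiating the lcd module for an assumed 50MHz clock.
// Wrapper and port names are assumptions made for this sketch.
module lcd_top_50mhz (
    input  wire       clock_50,     // assumed 50MHz board clock
    input  wire       soft_reset,   // assumed active-high and synchronous to clock_50
    input  wire [8:0] d_in,         // {RS, DB7..DB0} from the data producer
    input  wire       data_ready,   // strobe from the data producer
    output wire       rs,
    output wire       e,
    output wire [7:0] d,
    output wire       busy_flag
);
    lcd #(
        .CLK_FREQ(50000000)         // the delay parameters are derived from this value
    ) lcd_driver (
        .clock(clock_50),
        .internal_reset(soft_reset),
        .d_in(d_in),
        .data_ready(data_ready),
        .rs(rs),
        .e(e),
        .d(d),
        .busy_flag(busy_flag)
    );
endmodule
```

Two things are worth checking when changing the frequency. Short delays may no longer land on a whole number of cycles (50ns at 50MHz is 2.5 cycles), so the value gets rounded when it is converted to an integer. And the counter is 24 bits wide, which holds at most 16,777,215, so the 100ms power-up delay only fits for clocks up to roughly 167MHz. Beyond that the count register would need to be widened.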
March 21, 2019
About the Author: Ralph Heymsfeld is the founder and principal of Sully Station Solutions. His interests include artificial intelligence, machine learning, robotics and embedded systems. His writings on these and other diverse topics appear regularly here and across the Internet.