Step-by-Step Guide to Building a Custom PLD-Based UART in VHDL
Why build your own UART today? Because the built‑in peripherals on many cheap development boards are either hidden behind proprietary libraries or simply don’t support the exact baud rate you need. A small, custom UART on a PLD (CPLD or low‑cost FPGA) gives you full control, teaches you the inner workings of serial communication, and leaves you with a reusable block for future projects. In this post I’ll walk you through the whole process, from defining the interface to echoing characters on real hardware, using plain VHDL and a few practical tips from my own lab bench.
What a UART Actually Does
A UART (Universal Asynchronous Receiver/Transmitter) is a tiny state machine that turns parallel data from your logic into a serial stream, and vice‑versa. It adds a start bit, optional parity, and one or more stop bits so the receiver can tell where each byte begins and ends. The key timing element is the baud rate – the number of bits sent per second. All the rest is just moving bits in and out at the right moments.
Choose Your PLD and Toolchain
I usually start with a low‑cost Xilinx CPLD (e.g., XC9500) or a small Spartan‑6 FPGA. Both are supported by the free ISE WebPACK tools; Vivado only targets 7‑series and newer devices, so use it only if you pick a more recent FPGA. Any small FPGA has room for a UART, and the mid‑size and larger CPLDs do too. The steps below are tool‑agnostic; just replace the project creation commands with the ones you use.
Step 1: Define the UART Interface
Create a new VHDL file called uart.vhd. The entity should expose the classic signals:
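library ieee;
use ieee.std_logic_1164.all;

entity uart is
    generic (
        CLK_FREQ  : integer := 50_000_000;  -- system clock in Hz
        BAUD_RATE : integer := 115200       -- bits per second
    );
    port (
        clk      : in  std_logic;
        rst_n    : in  std_logic;                     -- active-low asynchronous reset
        tx_data  : in  std_logic_vector(7 downto 0);  -- byte to send
        tx_start : in  std_logic;                     -- pulse high to start sending
        tx_busy  : out std_logic;                     -- high while a frame is in flight
        rx_data  : out std_logic_vector(7 downto 0);  -- last byte received
        rx_ready : out std_logic;                     -- one-cycle pulse when rx_data is valid
        rx_error : out std_logic;                     -- framing error (missing stop bit)
        uart_tx  : out std_logic;                     -- serial output pin
        uart_rx  : in  std_logic                      -- serial input pin
    );
end uart;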
Notice the use of generic parameters for clock frequency and baud rate. This lets you reuse the same code on a 100 MHz board or a 25 MHz board without touching the logic.
Step 2: Build a Baud‑Rate Generator
The UART needs a tick that occurs once per bit period. The simplest way is a counter that divides the system clock down to the baud rate.
architecture rtl of uart is
    constant DIVISOR : integer := CLK_FREQ / BAUD_RATE;  -- integer division truncates
    signal baud_tick : std_logic;
    signal baud_cnt  : integer range 0 to DIVISOR-1 := 0;
begin
    -- Free-running divider: one single-cycle pulse every DIVISOR clocks
    process(clk, rst_n)
    begin
        if rst_n = '0' then
            baud_cnt  <= 0;
            baud_tick <= '0';
        elsif rising_edge(clk) then
            if baud_cnt = DIVISOR-1 then
                baud_cnt  <= 0;
                baud_tick <= '1';
            else
                baud_cnt  <= baud_cnt + 1;
                baud_tick <= '0';
            end if;
        end if;
    end process;
The baud_tick signal goes high for one system clock cycle at the start of each bit period. I like to keep it a single‑cycle pulse – it makes the state machines that follow much cleaner.
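One thing the divider glosses over is that CLK_FREQ / BAUD_RATE truncates. At 50 MHz and 115200 baud that makes no difference (the divisor is 434 either way), but at 16 MHz the quotient is about 138.9, which truncates to 138 and rounds to 139. Here is a minimal sketch of a helper package that rounds to the nearest integer and reports the resulting error. The package name, both function names, and the 20 per mille limit in the usage comment are my own assumptions, not part of the design above.

```vhdl
-- Hypothetical helper package (names are illustrative, not from the UART above).
package uart_baud_pkg is
    -- Clock divisor rounded to the nearest integer instead of truncated.
    function baud_divisor(clk_freq, baud_rate : positive) return positive;
    -- Baud-rate error of that divisor in per mille (tenths of a percent),
    -- itself truncated by integer arithmetic.
    function baud_error_permille(clk_freq, baud_rate : positive) return natural;
end package uart_baud_pkg;

package body uart_baud_pkg is

    function baud_divisor(clk_freq, baud_rate : positive) return positive is
    begin
        -- Adding half the denominator before dividing rounds to nearest.
        return (clk_freq + baud_rate / 2) / baud_rate;
    end function;

    function baud_error_permille(clk_freq, baud_rate : positive) return natural is
        constant div    : positive := baud_divisor(clk_freq, baud_rate);
        constant actual : positive := clk_freq / div;  -- baud rate really produced
    begin
        return (abs(actual - baud_rate) * 1000) / baud_rate;
    end function;

end package body uart_baud_pkg;

-- Possible use inside the uart architecture (after "use work.uart_baud_pkg.all;"):
--   constant DIVISOR : integer := baud_divisor(CLK_FREQ, BAUD_RATE);
--   ...
--   assert baud_error_permille(CLK_FREQ, BAUD_RATE) <= 20
--       report "baud-rate error too large for this clock" severity failure;
```

The assertion turns a bad clock and baud combination into an elaboration‑time failure instead of a mysterious framing error on the bench. Pick the limit to suit your link; the value shown is only a placeholder.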
Step 3: Transmitter State Machine
The transmitter shifts out the start bit, eight data bits, and a stop bit. A simple three‑state machine (IDLE, SEND, STOP) does the job.
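-- Declarations: these belong in the architecture's declarative part, before "begin"
type tx_state_type is (TX_IDLE, TX_SEND, TX_STOP);
signal tx_state  : tx_state_type := TX_IDLE;
signal tx_shift  : std_logic_vector(9 downto 0);
signal tx_bitcnt : integer range 0 to 9 := 0;

-- Transmitter process (goes after "begin"). uart_tx is driven only from this
-- process; a separate concurrent "uart_tx <= '1';" would add a second driver
-- and make the line resolve to 'X'. The idle-high level is set in reset instead.
process(clk, rst_n)
begin
    if rst_n = '0' then
        tx_state  <= TX_IDLE;
        tx_busy   <= '0';
        uart_tx   <= '1';  -- idle line is high
        tx_shift  <= (others => '1');
        tx_bitcnt <= 0;
    elsif rising_edge(clk) then
        case tx_state is
            when TX_IDLE =>
                tx_busy <= '0';
                if tx_start = '1' then
                    -- Bit 0 is sent first, so the start bit sits at index 0,
                    -- the data follows LSB first, and the stop bit is at index 9.
                    tx_shift  <= '1' & tx_data & '0';
                    tx_bitcnt <= 0;
                    tx_state  <= TX_SEND;
                    tx_busy   <= '1';
                end if;
            when TX_SEND =>
                if baud_tick = '1' then
                    uart_tx  <= tx_shift(0);
                    tx_shift <= '1' & tx_shift(9 downto 1);
                    if tx_bitcnt = 9 then
                        tx_state <= TX_STOP;  -- stop bit is now on the line
                    else
                        tx_bitcnt <= tx_bitcnt + 1;
                    end if;
                end if;
            when TX_STOP =>
                -- Hold the stop bit for a full bit period before going idle
                if baud_tick = '1' then
                    uart_tx  <= '1';
                    tx_state <= TX_IDLE;
                end if;
        end case;
    end if;
end process;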
A quick anecdote: the first time I wrote this block on a CPLD, I forgot to set the idle line to ‘1’. The result was a constantly low line that looked like a stuck key on my terminal. A good reminder that UART is “idle high” – a tiny detail that can save hours of debugging.
Step 4: Receiver State Machine
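Receiving is a bit trickier because you must detect the start bit, then sample the data bits in the middle of each bit period. The classic approach is to wait half a bit period after seeing the falling edge of the start bit, confirm the line is still low, and then sample once every full bit period. The timing has to be measured from that falling edge, because the incoming frame is not aligned to anything inside your design.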
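-- Declarations: these belong in the architecture's declarative part, before "begin"
type rx_state_type is (RX_IDLE, RX_START, RX_DATA, RX_STOP);
signal rx_state   : rx_state_type := RX_IDLE;
signal rx_shift   : std_logic_vector(7 downto 0);
signal rx_bitcnt  : integer range 0 to 7 := 0;
signal sample_cnt : integer range 0 to DIVISOR-1 := 0;
signal rx_sync    : std_logic_vector(1 downto 0) := "11";  -- two-flop synchronizer

-- Receiver process (goes after "begin"). It times its samples with its own
-- counter, restarted on the start-bit edge, rather than the free-running
-- baud_tick, whose phase is unrelated to the incoming frame.
process(clk, rst_n)
begin
    if rst_n = '0' then
        rx_state   <= RX_IDLE;
        rx_ready   <= '0';
        rx_error   <= '0';
        rx_bitcnt  <= 0;
        sample_cnt <= 0;
        rx_sync    <= "11";
    elsif rising_edge(clk) then
        -- uart_rx is asynchronous; rx_sync(1) is the synchronized copy
        rx_sync <= rx_sync(0) & uart_rx;

        case rx_state is
            when RX_IDLE =>
                rx_ready <= '0';
                if rx_sync(1) = '0' then          -- start bit detected
                    sample_cnt <= DIVISOR/2;      -- wait half a bit
                    rx_state   <= RX_START;
                end if;
            when RX_START =>
                if sample_cnt = 0 then
                    if rx_sync(1) = '0' then      -- still low, valid start
                        sample_cnt <= DIVISOR-1;  -- next sample one full bit later
                        rx_bitcnt  <= 0;
                        rx_state   <= RX_DATA;
                    else
                        rx_state <= RX_IDLE;      -- false start (glitch)
                    end if;
                else
                    sample_cnt <= sample_cnt - 1;
                end if;
            when RX_DATA =>
                if sample_cnt = 0 then
                    rx_shift(rx_bitcnt) <= rx_sync(1);  -- LSB arrives first
                    sample_cnt <= DIVISOR-1;
                    if rx_bitcnt = 7 then
                        rx_state <= RX_STOP;
                    else
                        rx_bitcnt <= rx_bitcnt + 1;
                    end if;
                else
                    sample_cnt <= sample_cnt - 1;
                end if;
            when RX_STOP =>
                if sample_cnt = 0 then
                    if rx_sync(1) = '1' then
                        rx_data  <= rx_shift;
                        rx_ready <= '1';
                        rx_error <= '0';
                    else
                        rx_error <= '1';          -- missing stop bit
                    end if;
                    rx_state <= RX_IDLE;
                else
                    sample_cnt <= sample_cnt - 1;
                end if;
        end case;
    end if;
end process;
end rtl;  -- closes the architecture opened in Step 2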
The receiver sets rx_ready high for one clock cycle when a full byte is captured. In a larger design you would typically latch this into a FIFO, but for a simple demo the flag is enough.
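If the logic that consumes the byte can't react within that single cycle, a one‑byte holding register is a cheap halfway house before a full FIFO. The following is only a sketch under my own assumptions: the rx_hold entity, the rd_ack handshake, and the overrun flag are hypothetical names that do not appear in the UART above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical one-byte holding register for the UART's receive side.
entity rx_hold is
    port (
        clk        : in  std_logic;
        rst_n      : in  std_logic;
        rx_data    : in  std_logic_vector(7 downto 0);  -- from the UART
        rx_ready   : in  std_logic;                     -- one-cycle pulse from the UART
        rd_ack     : in  std_logic;                     -- consumer pulses this after reading
        hold_data  : out std_logic_vector(7 downto 0);
        hold_valid : out std_logic;                     -- stays high until rd_ack
        overrun    : out std_logic                      -- last byte overwrote an unread one
    );
end rx_hold;

architecture rtl of rx_hold is
    signal valid_i : std_logic := '0';
begin
    hold_valid <= valid_i;

    process(clk, rst_n)
    begin
        if rst_n = '0' then
            valid_i   <= '0';
            overrun   <= '0';
            hold_data <= (others => '0');
        elsif rising_edge(clk) then
            if rx_ready = '1' then
                -- A new byte always wins; overrun is refreshed on every byte
                -- and records whether the previous one was still unread.
                hold_data <= rx_data;
                overrun   <= valid_i and not rd_ack;
                valid_i   <= '1';
            elsif rd_ack = '1' then
                valid_i <= '0';
            end if;
        end if;
    end process;
end rtl;
```

The consumer polls hold_valid, reads hold_data, and pulses rd_ack for one clock. Anything burstier than that really does want a FIFO.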
Step 5: Top‑Level Integration
Now wrap the UART into a top‑level module that connects to the PLD pins and a simple testbench. I like to expose a single tx_data register and a tx_start pulse, plus a rx_data output that the rest of the design can read.
library ieee;
use ieee.std_logic_1164.all;

entity uart_top is
    port (
        clk      : in  std_logic;
        rst_n    : in  std_logic;
        tx_data  : in  std_logic_vector(7 downto 0);
        tx_start : in  std_logic;
        rx_data  : out std_logic_vector(7 downto 0);
        rx_ready : out std_logic;
        uart_tx  : out std_logic;
        uart_rx  : in  std_logic
    );
end uart_top;

architecture rtl of uart_top is
begin
    u_uart : entity work.uart
        generic map (
            CLK_FREQ  => 50_000_000,
            BAUD_RATE => 115200
        )
        port map (
            clk      => clk,
            rst_n    => rst_n,
            tx_data  => tx_data,
            tx_start => tx_start,
            tx_busy  => open,   -- unused in this demo
            rx_data  => rx_data,
            rx_ready => rx_ready,
            rx_error => open,   -- unused in this demo
            uart_tx  => uart_tx,
            uart_rx  => uart_rx
        );
end rtl;
Compile the design, run a quick behavioral simulation (I use ModelSim), and verify that a transmitted byte appears on rx_data after the expected number of clock cycles. If the simulation looks good, move on to synthesis.
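For the simulation itself, a loopback testbench is the quickest sanity check: tie uart_tx to uart_rx, send one byte, and assert that the same byte comes out of the receiver. The sketch below makes a few assumptions of mine. It instantiates the uart entity directly rather than uart_top so it can use an unrealistically fast baud rate (5 Mbaud at 50 MHz, giving a divisor of 10) to keep the run short, and the names uart_tb, serial, and done are illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity uart_tb is
end uart_tb;

architecture sim of uart_tb is
    constant CLK_PERIOD : time := 20 ns;  -- 50 MHz

    signal clk      : std_logic := '0';
    signal rst_n    : std_logic := '0';
    signal tx_data  : std_logic_vector(7 downto 0) := (others => '0');
    signal tx_start : std_logic := '0';
    signal rx_data  : std_logic_vector(7 downto 0);
    signal rx_ready : std_logic;
    signal rx_error : std_logic;
    signal serial   : std_logic;          -- loopback wire: uart_tx -> uart_rx
    signal done     : boolean := false;   -- stops the clock when the test ends
begin
    clk_gen : process
    begin
        while not done loop
            clk <= '0'; wait for CLK_PERIOD / 2;
            clk <= '1'; wait for CLK_PERIOD / 2;
        end loop;
        wait;
    end process;

    -- Fast baud rate (assumption for simulation only): DIVISOR = 10
    dut : entity work.uart
        generic map (CLK_FREQ => 50_000_000, BAUD_RATE => 5_000_000)
        port map (
            clk => clk, rst_n => rst_n,
            tx_data => tx_data, tx_start => tx_start, tx_busy => open,
            rx_data => rx_data, rx_ready => rx_ready, rx_error => rx_error,
            uart_tx => serial, uart_rx => serial
        );

    stim : process
    begin
        wait for 5 * CLK_PERIOD;
        rst_n <= '1';

        -- One-cycle tx_start pulse. 0x53 is not a bit-palindrome, so a
        -- reversed bit order would show up as a mismatch.
        wait until rising_edge(clk);
        tx_data  <= x"53";
        tx_start <= '1';
        wait until rising_edge(clk);
        tx_start <= '0';

        -- A 10-bit frame takes about 100 clocks here; allow generous margin.
        wait until rx_ready = '1' for 400 * CLK_PERIOD;
        assert rx_ready = '1'
            report "timed out waiting for rx_ready" severity failure;
        assert rx_data = x"53"
            report "received byte does not match transmitted byte" severity failure;
        assert rx_error = '0'
            report "unexpected framing error" severity failure;

        report "loopback test finished";
        done <= true;
        wait;
    end process;
end sim;
```

A loopback only proves that the transmitter and receiver agree with each other, so it's still worth eyeballing the serial waveform once: idle high, a low start bit, data LSB first, then a high stop bit.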
Step 6: Synthesize and Assign Pins
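In ISE (or Vivado, if you are targeting a 7‑series or newer FPGA), create a new project, add the VHDL files, and set the target device to your CPLD/FPGA. Run synthesis. The UART needs several dozen flip‑flops plus a little combinational logic, which is trivial for any FPGA. On a CPLD every flip‑flop costs a macrocell, so a 36‑macrocell part is too small; check the fitter report before you commit to a device.
Next, assign uart_tx and uart_rx to the physical pins you will connect to a USB‑to‑TTL adapter or a simple LED‑based loopback, using your tool’s pin planner or a constraints file (a UCF file in ISE, an XDC file in Vivado). Remember to set the I/O standard to LVCMOS33 (or whatever your board uses).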
Step 7: Load and Test on Hardware
Program the PLD with the generated bitstream (or JEDEC file, for a CPLD). Hook up a USB‑to‑TTL cable: connect the cable’s TX to uart_rx and the cable’s RX to uart_tx, and don’t forget a common ground. Open a terminal program (e.g., PuTTY) at 115200 8N1. When you press a key, the terminal sends a byte; the PLD receives it, asserts rx_ready, and you can echo it back by asserting tx_start with the same rx_data. With local echo turned off in the terminal, each character you type should appear once, courtesy of the PLD. With local echo on, you will see it twice: once from the host and once from the PLD.
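The echo wiring can be as small as a wrapper that feeds the receiver’s outputs straight back into the transmitter. This is a minimal sketch, assuming the uart_top entity from Step 5; the uart_echo entity and its internal signal names are my own.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical echo wrapper: every received byte is sent straight back.
entity uart_echo is
    port (
        clk     : in  std_logic;
        rst_n   : in  std_logic;
        uart_rx : in  std_logic;
        uart_tx : out std_logic
    );
end uart_echo;

architecture rtl of uart_echo is
    signal data  : std_logic_vector(7 downto 0);  -- rx_data looped to tx_data
    signal ready : std_logic;                     -- rx_ready looped to tx_start
begin
    u_top : entity work.uart_top
        port map (
            clk      => clk,
            rst_n    => rst_n,
            tx_data  => data,
            tx_start => ready,
            rx_data  => data,
            rx_ready => ready,
            uart_tx  => uart_tx,
            uart_rx  => uart_rx
        );
end rtl;
```

This is fine for keys typed by hand. If the host streams bytes back to back, the transmitter may still be busy when the next rx_ready pulse arrives and that byte will be dropped, which is where tx_busy and a small buffer come in.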
If you encounter framing errors (rx_error high), double‑check the baud‑rate divisor and make sure the clock frequency matches the CLK_FREQ generic. A common mistake is forgetting to account for the PLL’s multiplication factor when you generate a 100 MHz clock from a 50 MHz crystal.
Step 8: Extend the Design
Now that the core UART works, you can add features:
- Parity – add a single parity bit in the shift registers.
- FIFO buffers – smooth out bursts of data.
- Multiple baud rates – expose a selector that changes the divisor on the fly.
- Hardware flow control – RTS/CTS lines for high‑speed links.
Each addition follows the same pattern: define a clear state machine, keep the code modular, and test with a small simulation before loading to hardware.
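As a starting point for the parity item, here is a small sketch of an even‑parity function. The package and function names are assumptions of mine; the calculation itself is just the XOR of all data bits, which makes the total number of ones (data plus parity) even.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package for the parity extension.
package uart_parity_pkg is
    -- Returns '1' when data holds an odd number of ones, so that
    -- data plus the parity bit always contains an even number of ones.
    function even_parity(data : std_logic_vector) return std_logic;
end package uart_parity_pkg;

package body uart_parity_pkg is
    function even_parity(data : std_logic_vector) return std_logic is
        variable p : std_logic := '0';
    begin
        for i in data'range loop
            p := p xor data(i);
        end loop;
        return p;
    end function;
end package body uart_parity_pkg;
```

On the wire the parity bit goes after the eighth data bit and before the stop bit, so the transmit shift register and its bit counter each grow by one. The receiver samples one extra bit and compares it with even_parity(rx_shift). For odd parity, invert the result.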
Wrap‑Up Thoughts
Building a UART from scratch is a great way to demystify serial communication and to get comfortable with VHDL state machines. The whole project fits on a mid‑size CPLD, and on a low‑cost FPGA it leaves plenty of room for other logic you might need, such as a simple SPI flash controller or a tiny soft‑core CPU. The next time you need a custom baud rate or a special framing format, you’ll already have a tested block you can drop into any PLD design.
Happy coding, and may your bitstreams always synthesize on the first try!