Streaming Video to a Pico 1 with MicroPython
I've been doing a bit of graphics experimentation recently with a Pimoroni Pico Display Pack 2.0 attached to a Raspberry Pi Pico W (with the RP2040 microcontroller). I got interested in rendering sprites, and decided to see how quickly you can actually get pixels on screen. This led me to explore the different framebuffer formats, and how to get bytes from the network onto the screen with MicroPython.
Scroll to the bottom for the result - streaming video at 10 FPS over the network to the Pico!
Pico Graphics
Pimoroni offer a basic 2D drawing library called Pico Graphics. It includes drivers for their various screens, along with 2D drawing primitives and text support. There are a few ways to initialize the screen, each with a tradeoff between memory and color palette:
Pen | Palette | Colors | Depth |
---|---|---|---|
PEN_P4 | Manual | 16 | 4 bits |
PEN_P8 | Manual | 256 | 8 bits |
PEN_RGB332 | Preset | 256 | 8 bits |
PEN_RGB565 | Preset | 64K | 16 bits |
PEN_RGB888 | Preset | 16M | 24 bits |
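The pen type is chosen when constructing the display object. As a minimal sketch (the constructor call mirrors the snippets later in this post), swapping formats is just a matter of changing one argument:

```python
import picographics

# The pen type picked here determines the framebuffer format, and therefore
# how much RAM the display object allocates.
display = picographics.PicoGraphics(
    display=picographics.DISPLAY_PICO_DISPLAY_2,
    pen_type=picographics.PEN_RGB332,  # e.g. PEN_RGB565 or PEN_P8 to trade color depth for memory
)
```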
You can calculate the memory usage based on the required size of the frame buffer. In my case, with a 320 x 240 pixel screen:
Pen | Formula | Buffer |
---|---|---|
PEN_P4 | 320 * 240 * 4 / 8 / 1000 | 38.4 kilobytes |
PEN_P8 | 320 * 240 * 8 / 8 / 1000 | 76.8 kilobytes |
PEN_RGB332 | 320 * 240 * 8 / 8 / 1000 | 76.8 kilobytes |
PEN_RGB565 | 320 * 240 * 16 / 8 / 1000 | 153.6 kilobytes |
PEN_RGB888 | 320 * 240 * 32 / 8 / 1000 | 307.2 kilobytes |
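The arithmetic behind the table, as a quick Python sketch:

```python
def framebuffer_kb(width, height, bits_per_pixel):
    # width * height pixels, bits -> bytes, bytes -> kilobytes
    return width * height * bits_per_pixel / 8 / 1000

print(framebuffer_kb(320, 240, 8))   # PEN_RGB332 -> 76.8
print(framebuffer_kb(320, 240, 16))  # PEN_RGB565 -> 153.6
```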
It's worth noting that the RP2040 only has 264 KB of SRAM, meaning PEN_RGB888 is out of the question (its pixels are stored as 32-bit words, which is why the formula above uses 32 rather than 24). The manually defined palettes (PEN_P4 and PEN_P8) are interesting but not easy to work with if we're talking about video, unless you want to analyse each frame and tailor a new palette to it. Perhaps that's another post.
Drawing Pixels
What's the fastest way to change the pixels for the entire screen? Let's analyse some different techniques. I'll be using the following very simple stopwatch class to measure performance:
```python
import utime

class Stopwatch:
    def restart(self):
        self.start_time = utime.ticks_us()

    def elapsed_ms(self):
        return utime.ticks_diff(utime.ticks_us(), self.start_time) / 1000
```
Baseline Update
Let's start by testing how fast the framebuffer can be presented to the screen; everything else will be slower than this because it will involve more work.
```python
import picographics
import random
from stopwatch import Stopwatch

display = picographics.PicoGraphics(display=picographics.DISPLAY_PICO_DISPLAY_2, pen_type=picographics.PEN_RGB332)
display_width, display_height = display.get_bounds()

sw = Stopwatch()
while True:
    sw.restart()
    display.update()
    print(sw.elapsed_ms())
```
With `PEN_RGB332` this takes 34 milliseconds, and with `PEN_RGB565` it takes 24 milliseconds. I think these (on the face of it counter-intuitive) results are because the screen itself uses the 24-bit `RGB888` format, so more processing is needed to get from 8 or 16 bits up to 24 bits (however, I did not have time to prove or disprove this hypothesis).
Using Clear
Here we'll set the color at random, and tell the screen to clear. This isn't useful for rendering video because we can only use the same color for every pixel, but it does give us an idea of how efficiently native code can update all pixels in the framebuffer.
```python
import picographics
import random
from stopwatch import Stopwatch

display = picographics.PicoGraphics(display=picographics.DISPLAY_PICO_DISPLAY_2, pen_type=picographics.PEN_RGB332)
display_width, display_height = display.get_bounds()

sw = Stopwatch()
while True:
    sw.restart()
    display.set_pen(display.create_pen(random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)))
    display.clear()
    display.update()
    print(sw.elapsed_ms())
```
For `PEN_RGB332` on an RP2040, the above code reports around 40 ms (25 FPS), and for `PEN_RGB565` it reports 30 ms (33 FPS). The overhead for setting the pen is negligible: in general around 5 ms goes to the `clear` method, and the remaining time is spent on `update`. Replacing the call to `clear` with `rectangle` yields exactly the same results, as you might expect:
```python
# display.clear()
display.rectangle(0, 0, display_width, display_height)
```
Setting Pixels with Pico Graphics
What happens if we iterate all pixels on the screen and set them individually? The documentation states that this is slow - but how does it compare? We'll change our render loop to the following:
```python
while True:
    sw.restart()
    display.set_pen(display.create_pen(random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)))
    for x in range(0, display_width):
        for y in range(0, display_height):
            display.pixel(x, y)
    display.update()
    print(sw.elapsed_ms())
```
For `PEN_RGB332` on an RP2040, the above code reports around 1.63 seconds to render a frame, and for `PEN_RGB565` it reports 1.62 seconds. We're able to set each pixel independently, but it's far too slow for watching video.
Setting Pixels in the Framebuffer
Pico Graphics allows us to access the framebuffer by wrapping the Pico Graphics instance in a `memoryview`. This can be treated as a byte array and manipulated directly. Let's see what happens if we access the framebuffer directly, and manipulate each byte in the same manner:
```python
fb = memoryview(display)

while True:
    sw.restart()
    byte = random.randint(0, 255)
    for i in range(0, len(fb)):
        fb[i] = byte
    display.update()
    print(sw.elapsed_ms())
```
For `PEN_RGB332` on an RP2040, the above code reports around 996 milliseconds to render a frame, and for `PEN_RGB565` it reports 1.94 seconds. Since we're operating directly on bytes in the framebuffer, it makes sense that double the bytes means double the time.
We're getting warmer: it's clear that setting pixels is too slow, however we do it. So, what if we were to copy a pre-computed buffer directly?
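As a quick sketch of that idea before moving on to files (not benchmarked, and assuming your MicroPython build supports slice assignment on a `memoryview`, which recent rp2 firmware does as far as I know), a pre-built buffer can be copied into the framebuffer in a single native operation:

```python
# Hypothetical pre-computed frame: one solid RGB332 color for the whole screen.
# Note this holds a second full-size copy of the frame in RAM.
frame = bytes([0b11100000]) * len(fb)

while True:
    sw.restart()
    fb[:] = frame  # one slice assignment instead of a Python loop over every byte
    display.update()
    print(sw.elapsed_ms())
```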
Reading from a File
The following ffmpeg command will generate a buffer from an image with the pixel format we specify:
```
ffmpeg -i <input image> -vf "scale=320:240" -f rawvideo -pix_fmt <pixel format> frame.bin
```
Pico Graphics Pen | ffmpeg -pix_fmt |
---|---|
PEN_RGB332 | rgb8 |
PEN_RGB565 | rgb565be |
PEN_RGB888 | rgb24 |
The full set of pixel formats supported by ffmpeg can be shown with `ffmpeg -pix_fmts`. Here's how we tweak the render loop to read directly from a file into the framebuffer:
```python
fb = memoryview(display)
f = open('frame.bin', 'rb')

while True:
    sw.restart()
    f.seek(0)
    f.readinto(fb)
    display.update()
    print(sw.elapsed_ms())
```
With `PEN_RGB332` we get around 45 ms to perform the update and, interestingly, we see the same time for a buffer twice the size in `PEN_RGB565` format. It seems that the I/O is fast and the bottleneck is the drawing. So, what if instead we do the same thing using a TCP stream, over the network?
Reading from a TCP Socket
The following ffmpeg command will open a TCP socket and stream the specified video at 10 FPS:
```
ffmpeg -re -i <input video> -vf "scale=320:240,fps=10:round=up" -f rawvideo -pix_fmt rgb8 tcp://0.0.0.0:2000\?listen
```
Here's the amended code, which reads the TCP stream directly into the framebuffer:
```python
import picographics
import socket

display = picographics.PicoGraphics(display=picographics.DISPLAY_PICO_DISPLAY_2, pen_type=picographics.PEN_RGB332)

sockaddr = socket.getaddrinfo('<ip address>', 2000)[0][-1]
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(sockaddr)

fb = memoryview(display)
while True:
    sock.readinto(fb)
    display.update()
```
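One thing the snippet above glosses over: the Pico W needs to be on the network before the socket can connect. A minimal sketch of joining WiFi first (the SSID and password are placeholders, not from the original setup):

```python
import network
import time

# Connect to WiFi before opening the TCP socket; credentials are placeholders.
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect('<ssid>', '<password>')
while not wlan.isconnected():
    time.sleep(0.1)
```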
Preview
Below is the 10 FPS result with Big Buck Bunny, filmed on a phone (which makes it look a lot worse than it actually is).
The limiting factor here is the available WiFi bandwidth, considering we're transmitting raw video frames at about 6 Mbit/s. However, compression adds compute requirements which are hard to meet on a microcontroller. Still, this is pretty good for a quick view of something like a CCTV or doorbell camera: since the stream in this case is being generated by ffmpeg, you can provide whatever you like as a source.