Debugging random slow CPU with rPI4 and 384x256 (solved: rPi4 needs more amps than rPi3)

I’ve been struggling with this one for a while. I don’t think it’s a problem with the library because the display refresh seems solid, as this video shows: rPi4 hangs https://github.com/hzeller/rpi-rgb-led-matrix/issues/1177 - YouTube
However, my code randomly hangs and then resumes. I’m pretty sure the cause is not the code itself but CPU throttling, although I couldn’t find anything obvious.
The code is single threaded: it generates a frame buffer, then calls show() to copy it to this driver’s canvas:

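void FastLED_RPIRGBPanel_GFX::show() {
    Framebuffer_GFX::showfps();
    // Copy the whole framebuffer to the driver's canvas, one pixel at a time.
    for (uint16_t y = 0; y < _fbh; y++) {
        for (uint16_t x = 0; x < _fbw; x++) {
            // The framebuffer is stored row by row, matrixWidth pixels per row.
            CRGB pixel = _fb[y*matrixWidth + x];
            uint8_t r = pixel.r;
            uint8_t g = pixel.g;
            uint8_t b = pixel.b;
            _canvas->SetPixel(x, y, r, g, b);
        }
    }
}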

If you look at this video: rPi4 hangs https://github.com/hzeller/rpi-rgb-led-matrix/issues/1177 - YouTube , there are some interesting bits.

The refresh seems constant enough, but the animation hangs, and around the 12s mark it hangs so badly that you can see the function above copying the canvas in what looks like almost 0.5s. There is no way a raw copy of data in this loop should ever take that long on such a CPU. I’m trying to understand what gets triggered to create such an issue.
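One way to put a number on those stalls is to time the copy loop itself and flag any frame that blows past a budget. The following is only a minimal stand-alone sketch, not the library's code: Pixel is a hypothetical stand-in for CRGB, timed_copy_us and is_stall are made-up helper names, and the copy goes into a plain vector instead of calling SetPixel on the real canvas.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for FastLED's CRGB type.
struct Pixel { uint8_t r, g, b; };

// Copies src into dst pixel by pixel (same loop shape as show()) and
// returns the elapsed time in microseconds, measured on a monotonic clock.
inline int64_t timed_copy_us(const std::vector<Pixel>& src, std::vector<Pixel>& dst,
                             uint16_t width, uint16_t height) {
    assert(src.size() == static_cast<std::size_t>(width) * height);
    assert(dst.size() == src.size());
    const auto start = std::chrono::steady_clock::now();
    for (uint16_t y = 0; y < height; y++) {
        for (uint16_t x = 0; x < width; x++) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            dst[i] = src[i];
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// True when a frame took longer than the caller's budget.
inline bool is_stall(int64_t elapsed_us, int64_t budget_us) {
    return elapsed_us > budget_us;
}
```

In the real show(), the same two steady_clock::now() calls could wrap the existing loop, printing a line whenever is_stall(elapsed, 50000) is true. The 50 ms budget is an arbitrary choice. Logging the timestamps of slow frames makes it easier to line them up with anything else happening on the system.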

I’ve put a heat sink on the CPU and GPU, but maybe it’s not good enough. I’m trying to find out if the cores are going into some massive throttling for some unknown reason (0.5s to do the loop above is super super slow).
The same code works flawlessly on rPI3 with a smaller display (192x160).

I’ve spent some time on this, but have come up empty so far, and was wondering if anyone has any ideas of what I should consider. I am running Raspbian, which I know is not ideal, but the hangs seem to happen with certain patterns and not others, so they seem to depend on what the code is computing and not on some random time interval from an external cronjob or daemon.

The same display, with the same everything, is running on the right in this demo: Table Mark Estes 192x160& 384x256 using FastLED_RPIRGBPanel_GFX & ArduinoOnPc-FastLED-GFX-LEDMatrix - YouTube , so the hangs are hard to pin down. They happen mostly in some demos and not others, but not consistently.

When the display hangs, I see:

top - 20:05:04 up 28 days,  6:43,  3 users,  load average: 1.96, 1.77, 1.59 
Tasks: 115 total,   2 running, 113 sleeping,   0 stopped,   0 zombie 
%Cpu(s): 41.2 us,  0.5 sy,  0.0 ni, 58.1 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
MiB Mem :   1939.4 total,   1567.8 free,    106.3 used,    265.3 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.   1735.2 avail Mem 
 
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                  
 5548 daemon    20   0   27416   6544   3364 R 165.6   0.3  51136:35 Table_Mark_Este     

I assume that because the driver is its own thread, I should be able to use 2 cores, or 200% CPU.
I just don’t know if 165% means my code is at 100% and the driver at 65% or some other combination.

I’m not too sure how to see if the cores are being thermally throttled, but it looks like they’re not:

root@rPi4b:~# cpufreq-info   |grep GHz
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)
  hardware limits: 600 MHz - 1.50 GHz
  available frequency steps: 600 MHz, 750 MHz, 1000 MHz, 1.50 GHz
  current policy: frequency should be within 600 MHz and 1.50 GHz.
  current CPU frequency is 1.50 GHz (asserted by call to hardware).
  cpufreq stats: 600 MHz:0.87%, 750 MHz:0.00%, 1000 MHz:0.00%, 1.50 GHz:99.12%  (453)

I assume you removed the snd_bcm2835 driver.

welcome to the forum :slight_smile:

Good question, I checked just to make sure, and the answer is yes, snd_bcm2835 is not loaded:

root@rPi4b:~/ArduinoOnPc-FastLED-GFX-LEDMatrix/examples# lsmod | grep snd_
snd_soc_core          200704  1 vc4
snd_compress           20480  1 snd_soc_core
snd_pcm_dmaengine      16384  1 snd_soc_core
snd_pcm                98304  3 vc4,snd_pcm_dmaengine,snd_soc_core
snd_timer              32768  1 snd_pcm
snd                    73728  4 snd_compress,snd_timer,snd_soc_core,snd_pcm
root@rPi4b:~/ArduinoOnPc-FastLED-GFX-LEDMatrix/examples# 

I just bought an rPi3 and will try on that board just in case there is something different with the kernel or compiler that causes this (my smaller picture frame uses an rPi3 and has no hangs at all, but it’s also a smaller display)

I did some tests on rPi3b. The exact same code definitely hangs less on rPi3 than on rPi4, but I occasionally get a single frame where I can see the copy loop at work (the framebuffer is copied to the rpi-led-matrix canvas, and the copy is visible because it takes almost a second).
I’m definitely hitting something like an rPi CPU throttling limit, but I’m not too sure which one or how.
I’ve checked that it’s not the input voltage (if the power supply is too weak, the red power LED goes off, showing the lack of power, and the CPU becomes 3 times slower).

Are you using the ethernet port? I found that performance decreases if there is a cable plugged into the port, so I only use it for initial setup and then switch to wi-fi which seems to not affect the matrix.

Good question, I’m not using ethernet, everything is generated locally. I have also confirmed that rPi3 hangs a lot less than rPi4 in my use, which is counter intuitive of course.

Less than you think, I have a 15A 5V power supply and it seems fine. Of course, I don’t go all white, or that would be a problem.

brownout (pixels look like some weird grey) and if the rPi is powered by the same power supply, it’ll reboot.

Thanks for the links. Yeah, I’m using a decent looking phone USB-C charger that’s supposed to be rated for this, but it may not be. I’ll swap the power supply (I use a different power supply for the panels so that if they go too bright, they don’t dip the voltage on the rPi)
I’ve gone back to an rPi3 which works better with the same code, so it looks like the rPi3 uses less power and therefore doesn’t throttle.

I do not have issues with signalling, I have a short cable from the electrodragon board to the panels and each panel re-amplifies the signal.

Bigger: I think I’m down to 100Hz refresh or close to it. I do have extra panels so I could make it 384x320 at the expense of refresh rate. It’ll still be just fast enough for the human eye, but it’ll be harder to get pictures/videos.

The rPi already is faster than what the panels can accept speed-wise.
The rPi4 cannot drive a longer chain than rPi3 from both my understanding and experience (on rPi4 you actually need to add a slowdown or it talks too fast and the panels can’t sync).
Or actually rPi4 may be something like 10-15% faster in resulting Hz due to timings that line up better, but definitely not 2X faster.
You can test this with an rPi3 and rPi4 without having panels by just running the command lines and seeing the Hz output given by the binary. You can’t see whether the signal would actually work, but if you have a single panel and it displays ok, that’s a good hint that the whole chain would work.

If you have data that shows differently from what I just said, please post your results :slight_smile:

I think I’m only using 2 cores, one for my code and one is used by the library, but still, it’s clear that I’m getting throttling. I definitely need to do more testing to see which one of the 2 is happening.

I believe you. I think this affects the Pi4 a lot more. That’s why going back to a Pi3 has mostly made the problem go away for me without changing anything else. Either the Pi4 got too hot, or it pulled too much power and went into power save, and neither seems to happen on the Pi3 since it doesn’t have as much performance.

Ok, let’s start with my original problem. I didn’t really believe it because I was using a good quality USB-C power supply, but it turns out the supply was not delivering enough power. That caused the rPi4 to go into a very slow mode, while the same supply powered an rPi3 without issues.
I switched to an expensive 100W USB-C laptop power supply (overkill, but this was a test), and I haven’t seen the big slowdowns since. So it wasn’t heat after all but really just power, and somehow cpufreq wasn’t able to show me that this was going on, not sure why.
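For anyone hitting the same thing: the Raspberry Pi firmware reports under-voltage and throttling through `vcgencmd get_throttled`, which prints a hex bitmask such as `throttled=0x50005`. That would likely have caught this sooner than cpufreq did. Below is a minimal sketch of a decoder for that value. The helper names parse_throttled and describe_throttled are made up for illustration, and the bit meanings follow the Raspberry Pi documentation as I understand it, so check them against the current docs.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Extracts the bitmask from a line like "throttled=0x50005".
inline uint32_t parse_throttled(const std::string& line) {
    const std::string prefix = "throttled=";
    const std::size_t pos = line.find(prefix);
    if (pos == std::string::npos) {
        throw std::invalid_argument("no 'throttled=' field in input");
    }
    // Base 16 accepts the optional "0x" prefix.
    return static_cast<uint32_t>(std::stoul(line.substr(pos + prefix.size()), nullptr, 16));
}

struct ThrottleFlag { uint32_t bit; const char* meaning; };

// Returns one description per bit set in the mask.
// Bits 0-3 describe the current state, bits 16-19 what happened since boot.
inline std::vector<std::string> describe_throttled(uint32_t value) {
    static const ThrottleFlag flags[] = {
        {0,  "under-voltage detected"},
        {1,  "ARM frequency capped"},
        {2,  "currently throttled"},
        {3,  "soft temperature limit active"},
        {16, "under-voltage has occurred"},
        {17, "ARM frequency capping has occurred"},
        {18, "throttling has occurred"},
        {19, "soft temperature limit has occurred"},
    };
    std::vector<std::string> out;
    for (const auto& f : flags) {
        if (value & (1u << f.bit)) out.push_back(f.meaning);
    }
    return out;
}
```

Feeding it the output of `vcgencmd get_throttled` after a hang should show whether bit 0 or bit 16 (under-voltage) is set, which points at the power supply rather than heat. A value of 0x0 means none of these conditions has been flagged since boot.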

On your other question: yes I update the whole framebuffer every time since it does not keep track of which pixels were updated. It happens so quickly that it’s not been an issue except when the chip gets throttled very heavily.
So yes, it was power all along, I feel silly, sorry :slight_smile: