Robbat2 (robbat2) wrote,

How deep can a bug go?

Ok, so it seems that I'm blogging again a bit, but only about software bugs and treachery. Today's post is about how I've burnt about 6 hours of development time, working through what seemed to be a simple bug.

Some background first. I'd like to switch away from an older system that's presently only used as a display head, with dual 20" LCDs on an Nvidia card. It's too old and limited to run a lot of things on directly. The new system would be both a display head and runs most apps already is a quad G5, with 12GiB of RAM. The great holdup has been in graphics. While I have access to both the stock GeForce 6600 NV43 that shipped with the machine, and an ATI X1900 G5 edition that I purchased later, my luck with graphics drivers has been less than stellar. My choices are between Nouveau, of which this tale ensues, and the competing free ATI drivers (which, at my last testing, were both still stuck, unable to read the AtomBIOS from the G5 card).

Some months ago, I filed a bug for Nouveau, that the first display output worked perfectly, but the second did not. It sat for about a month, before I got a response to go and try again. I didn't get to it until yesterday, because I was away in South Africa for two weeks, and busy with a myriad other things.

Update to the latest Nouveau and x11-drm trees yesterday afternon, and I find that it no longer even starts up a single display now. The debugging thus begins.

  1. The Device Control Block datastructure seems to have a bad data signature.
  2. Trace it. This is because the pointer to it is byteswapped.
  3. Hack in a byteswap. The signature is also byteswapped, something more central is wrong.
  4. The functions for "le{16,32}_to_cpu" seem to be broken. Just force them to byteswap for now.
  5. Update the nouveau bug again.
  6. Now we get an -ENOMEM from a memory allocation seemingly, but digging deeper, it's actually from ioctl.
  7. Update the nouveau bug again. Give up for the night.
  8. Update the x11-drm Git tree in the morning, and see that there's a fix for the ioctl stuff. Rebuild with it, and find that X will now start, but...
  9. The colors are badly swapped too! Red and Green are exchanged, and everything white is in a horrible shade of yellow.
  10. Try to initialize the second screen. "xrandr --display DVI-D-1 --right-of DVI-D-0". The taskbar expands as if it was covering both screens, but the second display is still not actually enabled.
  11. Update the nouveau bug again. Go and do other stuff.
  12. Start prodding into the nouveau driver source for the third time, looking to see about color issues. Mention this in the IRC channel. pq mentions to check that actual X_BYTE_ORDER macro.
  13. Doing a quick C program gives the correct output, however during the Nouveau compile, it's defined to _X_BYTE_ORDER (with a leading underscore), and THAT isn't defined.
  14. Look at the bugzilla again, locate a bug for the new issue.
  15. Leave a comment with some useful output on the bug, and then set out to trace it myself.
  16. Revert a Gentoo patch that removes _X_BYTE_ORDER (with the underscore) from xorg-server's
  17. Notice that _X_BYTE_ORDER is being defined to an EMPTY string now. That is NOT right.
  18. Start reading the rest of _X_BYTE_ORDER should have the value of "$ENDIAN".
  20. Absolutely great. So what the hell is the problem? Well, on powerpc, the macro returns "universal".
  21. Just what the hell is "universal"??? Well it seems that in autoconf-2.6.2, the autoconf maintainers made AC_C_BIGENDIAN take another input.
  23. So where does leave us? Up the creek with a broken autoconf I think. I haven't figured out why autoconf is telling us that we are universal yet, beyond that it's designed for the OSX universal binaries.

Hopefully I'm not tromping off into the depths of Mordor (aka the hell of autoconf M4 as maintained by vapier and flameeyes.

Tags: autoconf, gentoo
  • Post a new comment


    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

    Your IP address will be recorded