Buddhabrot Buddhabrot
Optimize the code

If you've tried to run any of the examples so far, you'll have noticed painting a Buddhabrot is very costly in terms of CPU usage, and it takes very long. We can't avoid CPU usage -those calculations are going to get done anyway-, but we can significantly decrese execution time ( :,D).

Okay, there's two different kinds of optimizations here:

  1. A conceptual one; it breaks down to examining M's properties, and shape the code accordingly. A site that describes in deep detail M and its characteristics is Mu-Ency; it rules!
  2. And the other less intelligent one would be to produce a code that does the same stuff, but faster.
Identify and solve the problem

The key question is: "where is it we are wasting time?" And the obvious answer is: "finding Cs that belong to M". Think about it for a moment: for most Cs outside M we're just calculating a few Zs (more and more as we approach the border of M); but for Cs on the inside we've got to iterate to maxN -and they don't paint squat...

Now we know we have to weed out as many Cs as we can. Let's get on with the first optimization -it can be achieved in at least two ways:

  • A: Ignore Cs in regions we are positive they belong to M.
  • B: Look for repeating Z values in a trajectory so we can discard C. If Z repeats at less than 2 units from the origin of coordinates, it means that trajectory is looping, and so C belongs to M.

The cool thing is that both approaches can be used complementarily. The first one can be harcoded, but for the second we have to take M's measurements.

Mandelbrot
Mandelbrot regions selection
Cardioid
Optimization A

At first glance, the image looks like a cardioid shape with uncountable circumferences attached to it. Reading up mathematic studies, we verify it is really a cardioid with circumferences around.

We can use these geometric shapes to get a simplified version of the Mandelbrot. And that is precisely what I wasted my time on, borrowing data from here and there, but especially from Mu-Ency; the result is the picture below.

Black regions hold Cs to discard. I discarded the central cardioid, and the four largest circumferences. Colored regions contain, of course, the Cs to check. Every color is a region that will be separately analyzed, because calculating the geometric figures borders' positions require different parameters.

Let's make something clear: as I understand, the only perfect circumference in M is the buddha's 'head' (center at (-1, 0 i)). Those circles stuck to the cardioid are known as Mu atoms. To make life easier, in regions 2, 5, and 6, I've discarded a circle inside the corresponding Mu atom -remember this if you keep reading: the measurements I give here do not represent the exact outline of the atoms.

Regions 1 and 9 don't have many points in M compared to the rest; so we can bulk-check all Cs in there.

We discard the 'bun' on the buddha's head for region 2, that is, the circumference centered at (-1.30925, 0 i) and radius 0.05875; we ignore another circumference for region 3 the 'head', which is centered at (-1, 0 i) and has a radius of 0.25.

For the remaining 5 regions, we have to know where the cardioid is. We can describe it saying that its fixed circumference has a radius of 0.25 and is centered at the origin of coordinates.

In regions 4, 7 and 8, we just ignore the cardioid, but for region 6 we have to know where the buddha's 'hands' are: these circumferences have radius 0.094 and center at (-0.125, -0.74401 i) and (-0.125, 0.74401 i). We use these circumferences again to limit region 5.

At this point, you have to know a couple of trig formulae to find points on the circumferences' and the cardioid's borders. We have the X coordinate (C real), and we want to know Y (C img). For circumferences, we can use good ole Pythagoras:

            c1 = sqrt (h2 - c22).

But the cardioid is a whole different story: we are facing a fourth-degree equation (>,O)!! -for a given X, we can have up to 4 Ys. Luckily, the problem is actually easier tha it looks. look at the cardioid drawing. When X is between 0 and 2, there can only be 2 Ys, and one is the negative of the other. So there aren't two solutions but just 1. And it can be calculated in the following manner:

            sinAngle = ((sqrt (1 + 4 * X * a) - 1) / 2
            Y = a * (1 + sinAngle) * sqrt (1 - sinAngle2)

When X is between 0 and -0.25, the only thing we change is the calculation of sinAngle, buecause now we are expecting two solutions:

            sinAngle = ((+/- sqrt (1 + 4 * X * a) - 1) / 2

Depending on which sinAngle we use, we'll get the Y closer or farther from 0. Now, if we just change the sign for the 2 Ys we find the other 2 solutions. Freakin' easy, right?

After coding this approach for the cardioid and running a couple of tests, though, i've come to realize there's an even faster way to solve the problem, even though it's no more than an approximation. Given an angle, any one, we can find the point on the border with this formula:

            X = a * (1 - sinAngle) * sinAngle
            Y = a * (1 - sinAngle) * cosAngle

The idea is to calculate successive points by increasing the angle, until we find one close enough to the one we'd find with the other method, and use it. It seems that in this case, brute force is more efficient.

Oh yeah, before I forget: just like we ignored a few circumferences in M, we should also discard points more than 2 units away from the origin of coords.

And the final trasure -something so obvious many have overlooked it-: the Mandelbrot (and the Buddhabrot) is symetric on the real axis, so it's enough to check half the Cs, and paint trajectories on both sides of the image ( :,D).

Time to put theory in practice and type the code. There's still room for improvements, that's why some parts have ellipsis; we'll replace the later.

private float[][] getPixels(int maxN, float increm, int ladoImg) {
    double rC;
    double angulo = 3 * Math.PI / 2;
    double incremAngulo = 2 * Math.PI / 145000;
    double medioIncrem = increm / 2;
    float Cr, CrTemp, Ci, maxCi, x, y;
    int ladoImgAux = ladoImg - 1;
    float factor = ladoImgAux / 4f;
    float[][] pixels = new float[ladoImg][ladoImg];
    int limiteN;


    // Region 1
    for (Cr = -2; Cr < -1.368f; Cr += increm) {
      maxCi = (float)Math.sqrt(4 - Cr * Cr);
      for (Ci = 0; Ci < maxCi; Ci += increm) {
        ...
      }
    }

    // Region 2

    rC = 0.05875 * 0.05875;
    for (Cr = -1.368f; Cr < -1.2505; Cr += increm) {
      Ci = (float)(Math.sqrt(rC - (Cr + 1.30925) * (Cr + 1.30925)));
      maxCi = (float)(Math.sqrt(4 - Cr * Cr));
      for(; Ci < maxCi; Ci += increm) {
        ...
      }
    }

    // Region 3

    rC = 0.25 * 0.25;
    for (Cr = -1.2505; Cr < -0.75; Cr += increm) {
      if (Cr < -1.25) Ci = 0;
      else            Ci = (float)(Math.sqrt(rC - (Cr + 1) * (Cr + 1)));
      maxCi = (float)(Math.sqrt(4 - Cr * Cr));
      for(; Ci < maxCi; Ci += increm) {
        ...
      }
    }

    // Region 4

    for (CrTemp = -0.75f; Cr < -.219f; angulo += incremAngulo) {
      Cr = (float)((.5 * (1 - Math.sin(angulo)) * Math.sin(angulo)) + .25);
      if (Cr > CrTemp - medioIncrem && Cr < CrTemp + medioIncrem) {
        Ci = (float)(.5 * (1 - Math.sin(angulo)) * Math.cos(angulo));
        maxCi = (float)(Math.sqrt(4 - Cr * Cr));
        for(; Ci < maxCi; Ci += increm) {
          ...
        }
        CrTemp += increm;
      }
    }

    // Region 5
    rC = .094 * .094;
    for (Cr = -.219f; Cr < -.031; Cr += increm) {
      Ci = (float)(Math.sqrt(rC - (Cr + .125) * (Cr + .125)) + .74401);
      maxCi = (float)(Math.sqrt(4 - Cr * Cr));
      for(; Ci < maxCi; Ci += increm) {
        ...
      }
    }

    // Region 6

    rC = .094 * .094;
    for (Cr = CrTemp = -.219f; Cr < -.031; angulo += incremAngulo) {
      Cr = (float)((.5 * (1 - Math.sin(angulo)) * Math.sin(angulo)) + .25);
      if (Cr > CrTemp - medioIncrem && Cr < CrTemp + medioIncrem) {
        Ci = (float)(.5 * (1 - Math.sin(angulo)) * Math.cos(angulo));
        maxCi = (float)(-Math.sqrt(rC - (Cr + .125) * (Cr + .125)) + .74401);
        for(; Ci <= maxCi; Ci += increm) {
          ...
        }
        CrTemp += increm;
      }
    }
    
    // Region 7

    for (CrTemp = -0.031; Cr < .375 - increm; angulo += incremAngulo) {
      Cr = (float)((.5 * (1 - Math.sin(angulo)) * Math.sin(angulo)) + .25);
      if (Cr > CrTemp - medioIncrem && Cr < CrTemp + medioIncrem) {
        Ci = (float)(.5 * (1 - Math.sin(angulo)) * Math.cos(angulo));
        maxCi = (float)(Math.sqrt(4 - Cr * Cr));
        for(; Ci < maxCi; Ci += increm) {
          ....
        }
        CrTemp += increm;
      }
    }
    
    // Region 8
    for (CrTemp = .375f; Cr >= .25; angulo += incremAngulo) {
      Cr = (float)((.5 * (1 - Math.sin(angulo)) * Math.sin(angulo)) + .25);
      if (Cr > CrTemp - medioIncrem && Cr < CrTemp + medioIncrem) {
        Ci = (float)(.5 * (1 - Math.sin(angulo)) * Math.cos(angulo));
        for(; Ci >= 0; Ci -= increm) {
          ...
        }
        CrTemp -= increm;
      }
    }
    
    // Region 9
    for (Cr = .375f; Cr < 2; Cr += increm) {
      maxCi = (float)(Math.sqrt(4 - Cr * Cr));
      for (Ci = 0; Ci < maxCi; Ci += increm) {
        ...
      }
    }

    return pixels;
  }

As you can see, implementing this wasn't so tough; we could even add more regions to discard more 'buds' (Mu atoms). And if you're really suave, you can even weed out flying minibrots.

Optimization B

If we spot a repeated Z in a trajectory, we know it's looping, and will repeat ad infinitum; thus, C belongs to M. I've implemented this looking for equal points in the trajectory at a given moment.

Ok, we only have left to discuss the most straight-forward type of optimization -accelerate the code without changing the approach. Coming up is the code that was missing above (where the ellipsis are). To make this brief, I've also included the previous 'optimization B'.

        ZrAux = Zi = n = 0;
        while (n < maxN) {
          Zr = ZrAux * ZrAux - Zi * Zi + Cr;
          trayectoria[n][1] = Zi = 2 * ZrAux * Zi + Ci;
          trayectoria[n++][0] = ZrAux = Zr;
          // Optimization B

          if (n == limiteN) {
            x = trayectoria[limiteN - 1][0];
            y = trayectoria[limiteN - 1][1];
            for (j = limiteN - 2; j >= 0; j--) 
              if (x == trayectoria[j][0] && y == trayectoria[j][1]) j = -1;
            if (j == -2) break;
          }
          if (Zr * Zr + Zi * Zi > 4) {
            for (n--, i = 0; i < n; i++) {
              coordX = (int)((trayectoria[i][0] + 2) * factor);
              coordY = (int)((trayectoria[i][1] + 2) * factor);
              pixels[coordX][coordY]++;
              pixels[coordX][ladoImgAux - coordY]++;
            }
            break;
          }
        }

Let's break this down. If we compare this code to previously shown snippets, differences are obvious:

  • Fry ZiAux variable. We can live without it; alternatively, we could fry ZrAux -depends on which Z component is calculated first.
  • Several inline value assignations.
  • No conveniency method invoking to calculate Z. The less method calls, the better!
  • On painting the trajectory, ignore the last point; Since we'll always be painting pixels inside the image, we save a lot of checks. As a side effect, the Buddhabrot will appear inside a circumference, and the image corners will be black.
  • Oh, and we paint pixels on both sides of the image.

Regarding optimization B, it's pretty simple; we take the last computed point, and look for an equal one throughout the trajectory. The j value indicates if a match was found. When do we start the search to get the best results? To put it in another way, what value do we have to give limiteN? Well, it depends again on the calculated Zs and the Cs examined. On an unzoomed image, up until 5000 maxN (maybe more, I don't know), and looking at 400 Cs per pixel, limiteN is fine at half maxN; I know that much. And when maxN equals 50000, limiteN is beter off at 3000.

Mass approach for animations: BuddhaFile

When I started with the animations, I realized that, yeah, the code may be quite optimized, but to paint many frames it's still damn slow. This called for a more practical approach. Then I thought if we saved Cs to a file we wouldn't have to go through the whole selection process -just read the important values from the disk.

What's the best structure for this kind of files? I talked about saving Cs, but we're also interested in the number of iterations Z will take to escape (n) in order to skip the check for each loop. In the end, the file will hold a series of blocks with this content:

  • C-real,
  • C-imaginary, and
  • n.

To create these files, the method should look a lot like the 'getPixels()' above -use it to fill in the ellipsis.

import java.io.*;
  ...       
  public void creaBuddhArchivo(String nombreArchivo, int maxN, float increm) {
    // Initialize variables
    ...
    ObjectOutputStream flujoSalida = null;
    try {
      FileOutputStream flujo = new FileOutputStream(nombreArchivo);   
      flujoSalida            = new ObjectOutputStream(flujo); 
    } catch (StreamCorruptedException ex) { ex.printStackTrace(); 
    } catch (IOException ex)              { ex.printStackTrace(); 
    } catch (SecurityException ex)        { ex.printStackTrace(); 
    } catch (NullPointerException ex)     { ex.printStackTrace(); }
    
    // Region x

    ...      
       while (n < maxN) {
          ...
          if (Zr * Zr + Zi * Zi > 4) {
            try {
              flujoSalida.writeFloat(Cr);
flujoSalida.writeFloat(Ci);
flujoSalida.writeInt(--n); } catch (IOException ex) { ex.printStackTrace(); } break; } } ... try { flujoSalida.close(); } catch (IOException ex) { ex.printStackTrace(); } }

And when we want to paint the Buddhabrot, use this other method:

  private float[][] getPixelsBuddhArchivo(String nombreArchivo, int ladoImg) {
    int i, n, coordX, coordY, ladoAux = ladoImg - 1;
    float factor = ladoAux / 4f, Cr, Ci, Zr, Zi, ZrAux;
    float[][] pixels = new float[ladoImg][ladoImg];

    int buffered, tamBuf = 1000;
    float[][] Cs = new float[tamBuf][2];
    int[] maxNs  = new int[tamBuf];

    ObjectInputStream flujoEntrada = null;
    try {
      FileInputStream flujo = new FileInputStream(nombreArchivo);   
      flujoEntrada          = new ObjectInputStream(flujo); 
    } catch (StreamCorruptedException ex) { ex.printStackTrace(); 
    } catch (IOException ex)              { ex.printStackTrace(); 
    } catch (SecurityException ex)        { ex.printStackTrace(); 
    } catch (NullPointerException ex)     { ex.printStackTrace(); }
    
    do {
      buffered = buffer(Cs, maxNs, flujoEntrada);
      for (i = 0; i < buffered; i++) {
        Cr = Cs[i][0];
        Ci = Cs[i][1];
        ZrAux = Zi = n = 0;
        while (n++ < maxNs[i]) {
          Zr = ZrAux * ZrAux - Zi * Zi + Cr;
          Zi = 2 * ZrAux * Zi + Ci;
          ZrAux = Zr;
          coordX = (int)((Zr + 2) * factor);
          coordY = (int)((Zi + 2) * factor);
          pixels[coordX][coordY]++;
          pixels[coordX][ladoAux - coordY]++;
        }
      }
    } while (buffered == tamBuf);
    try { flujoEntrada.close(); } catch (IOException ex) { ex.printStackTrace(); }
    return pixels;
  } 

  private int buffer(float[][] Cs, int[] maxNs, ObjectInputStream flujoEntrada) {
    int pivote = 0;
    int tamanoBuffer = Cs.length;
    try {
      while (pivote < tamanoBuffer) {
        Cs[pivote][0]   = flujoEntrada.readFloat();
        Cs[pivote][1]   = flujoEntrada.readFloat();
        maxNs[pivote++] = flujoEntrada.readInt();
      }
    } catch (EOFException ex) { ex.printStackTrace(); 
    } catch (IOException ex)  { ex.printStackTrace(); }
    return pivote;
  }

In brief, 'getPixelsBuddhArchivo()' loads a buffer with a few Cs (1000 in the example) along with their corresponding Ns, and then paints. When the buffer is consumed, it gets reloaded and paints a bit more; this goes on until the whole BuddhaFile has been read.

It's a good idea to create a BuddhaFile with half the plane since we'll be painting the same on both sides of the image -this way we speed up the process, and don't fill the disk with redundant data.

The next step is to produce several images in one go, instead of reading the whole file once per frame; but I'll leave this to whoever feels like implementing this their own way.

 
Copyright © Albert Lobo Cusidó 2006-2014