

one (Hone.jl)

When we last left off with Hone.jl, a modular, object-oriented graphing library that uses meta-programming to provide an extensible platform, we had run into a long list of issues. First, the grids were drawn so inaccurately that they were virtually useless. This also created a blocker: labels could not be added to such a grid. The resulting plots looked something like this:

[Image: a plot from the previous version of Hone.jl, with an inaccurate grid]

Additionally, although the points lined up correctly with the coordinates we plotted, they did not line up with our axes and grid. This is because the origin of a Compose illustration is not where a data scientist would typically expect the origin of a graph to be. Instead of the bottom left, the origin is in the top left-hand corner, and Y values increase downward. As a result, a point plotted at, for example, (5,5) would appear in the top left rather than the bottom left.
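A minimal sketch of the conversion this implies, assuming a frame 720 units tall. The helper name `flip_y` is hypothetical and is not part of Hone.jl or Compose:

```julia
# Convert a y value measured from the bottom of a frame (the way a plot is
# usually read) into one measured from the top (the way it is drawn).
# `flip_y` is a hypothetical helper used only for illustration.
flip_y(y, frame_height) = frame_height - y

frame_height = 720
near_bottom = flip_y(5, frame_height)    # 715: drawn close to the bottom edge
near_top = flip_y(715, frame_height)     # 5: drawn close to the top edge
```

Applying the same function twice returns the original value, so it works in both directions.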

Another thing I wanted to change dramatically about Hone is the way plots are assembled. Previously, meta-expressions were thrown together and parsed in one long, messy block of meta-programming. The entire idea of Hone revolves around being modular: in a pinch, you should be able to add a different grid from somewhere else, or even run a different coordinate parser for different results. With the scatter functions as they are now, however, this is simply not possible.
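To make the idea concrete, here is a rough sketch of assembling a result from independent pieces of source text. It is not Hone's actual code: `grid_tag`, `axis_tag`, and `assemble` are hypothetical names, and `tuple` stands in for a real drawing call so that the example runs with nothing but Base Julia.

```julia
# Each hypothetical component contributes a string of Julia source (its "tag").
grid_tag() = "\"grid\""
axis_tag(orientation) = string("\"axis ", orientation, "\"")

# Join the tags into one expression string, parse it, and evaluate it.
# `tuple` is only a stand-in for a drawing function here.
function assemble(tags)
    source = string("tuple(", join(tags, ", "), ")")
    return eval(Meta.parse(source))
end

result = assemble([grid_tag(), axis_tag(:X), axis_tag(:Y)])
```

Because each component only hands over a tag, swapping in a different grid or axis means swapping one element of the list and leaving the assembly code alone.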

Fixing Our Grids

Probably the most prominent and frustrating issue I have faced with Hone is the grid. The grid is drawn by dividing the frame’s measurements into equal parts, which one would suppose is pretty simple. Here’s the math:

function Grid(divisions,frame=Frame(1280,720,0mm,0mm,0mm,0mm),colorx=:lightblue,colory=:lightblue,thickness=.2)
    xlen = frame.width
    ylen = frame.height
    # Spacing between grid lines
    division_amountx = xlen / divisions
    division_amounty = ylen / divisions
    total = 0
    Xexpression = "(context(), "
    # Build one line expression per division and append it to the expression string
    while total < xlen
        total = total + division_amountx
        linedraw = Line([(0,total),(xlen,total)],:lightblue,thickness)
        exp = linedraw.update(:This_symbol_means_nothing)
        Xexpression = string(Xexpression,string(exp))
    end
    # (rest of the function omitted)
end

At first glance, this code certainly appears to work. First, we get the frame’s width and height and divide each by the number of divisions. After that, we add the expression for each line, one at a time, until we reach the width or height of the frame. We do this in a while loop, which keeps running while the total is less than the width or height of the corresponding frame.

So what’s the issue?


I think the real problem here demonstrates how a very minor inconsistency can create a really big problem down the line. The division amount for the X axis is calculated by dividing the width by the number of divisions in the grid. The problem is that the X grid lines are horizontal, so they are spaced along the frame’s height and mark values on the Y axis, and exactly the opposite is true for the Y lines. How could we fix this? Although it might be confusing at times, this can be fixed by simply reversing our X’s and Y’s:

function Grid(divisions,frame=Frame(1280,720,0mm,0mm,0mm,0mm),colorx=:lightblue,colory=:lightblue,thickness=.2)
    xlen = frame.width
    ylen = frame.height
    # Swapped: the X lines are spaced along the height, the Y lines along the width
    division_amountx = ylen / divisions
    division_amounty = xlen / divisions
    total = 0
    Xexpression = "(context(), "
    # Build one line expression per division and append it to the expression string
    while total < xlen
        total = total + division_amountx
        linedraw = Line([(0,total),(xlen,total)],:lightblue,thickness)
        exp = linedraw.update(:This_symbol_means_nothing)
        Xexpression = string(Xexpression,string(exp))
    end
    # (rest of the function omitted)
end

Likewise, we will need to follow this idea when we approach the challenge of creating labels for the grid. All that changed in this example is that ylen is now divided to get division_amountx, and xlen is divided to get division_amounty. As a result, we have a beautiful grid that looks like this:

[Image: the corrected grid]
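The spacing math behind the grid can be checked on its own, without drawing anything. The sketch below assumes a 1280 by 720 frame and four divisions. `grid_positions` is a hypothetical helper, not a function from Hone.jl, and unlike the while loop above it computes each position by multiplication and leaves out the frame edges.

```julia
# Hypothetical helper: positions of evenly spaced grid lines along one
# dimension of a frame, excluding the edges at 0 and `len`.
function grid_positions(len, divisions)
    spacing = len / divisions
    return [spacing * i for i in 1:(divisions - 1)]
end

width, height = 1280, 720
# Horizontal lines are spaced along the height, vertical lines along the width.
horizontal_lines = grid_positions(height, 4)  # y positions: 180.0, 360.0, 540.0
vertical_lines = grid_positions(width, 4)     # x positions: 320.0, 640.0, 960.0
```

Spacing the horizontal lines by the width instead would give 320, 640, and 960 as y positions, and the last of those falls outside a frame that is only 720 tall.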

GridLabels

The next type I wanted to create, now that we have working Grid and Label types, is of course labels for the X and Y axes. We will start with parameters. We have two choices for getting the grid positions. We could either

  • save them in the Grid type

  • or calculate them again.


While saving them in the Grid type might give us some ease of use in terms of parameters, I believe it will make it much more difficult to implement more custom features in the future. Part of the problem is that GridLabels isn’t meant to be a wrap-up function that combines many elements into a frame, but rather a feature that can be added to or removed from a frame. In that regard, it is more similar to the methods in the HDraw.jl file than to those in the HPlot.jl file. GridLabels is also going to need our x and y values in order to generate the labels themselves.

function GridLabels(x,y,grid,buffer=20)
    frame = grid.frame
    xvals = grid.xvals
    yvals = grid.yvals
    topy = maximum(y)
    topx = maximum(x)
    tag = "(context(), "

We can obtain the frame, which we will need for some calculations, from the grid type. In my original version of this function, I decided to try storing the values for the grid’s positions in the type:

    for value in xvals
        lbl = value / frame.height * topy
        grlbl = Label(string(lbl),buffer - 5, value)
        tag = string(tag,grlbl.tag)
    end
    for value in yvals
        lbl = value / frame.width * topx
        grlbl = Label(string(lbl),value, buffer - 5)
        tag = string(tag,grlbl.tag)
    end
    tag = string(tag,")),")
    composition = Meta.parse(tag)
    show() = eval(composition)
    # (rest of the function omitted)
end

This was done by taking each value as a percentage of the frame’s height, which the grid’s mathematics are based on. It’s important to remember that every object’s position in Hone is directly proportional to the frame above it, just like in most graphics work. We then multiply this percentage by the highest value in our data. In the example of having four grid lines, we would have .25 incrementing up to 1, each multiplied by the maximum value in our data for the label text, and by the highest value in our resolution for the position. However, I found that storing data like this in the Grid type created a lot more problems than solutions.
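As a small worked example of the proportion described above, assume a frame 720 units tall, grid lines at every quarter of the height, and a largest data value of 20. `label_value` is a hypothetical helper, and the sketch ignores both the buffer and the flipped origin.

```julia
# Hypothetical helper: the data value that a grid line at pixel `position`
# stands for, when the full frame length `len` corresponds to `top`,
# the largest value in the data.
label_value(position, len, top) = position / len * top

height = 720
topy = 20
positions = [180, 360, 540, 720]
# Fractions 0.25, 0.5, 0.75, and 1.0 of topy
labels = [label_value(p, height, topy) for p in positions]
```

Under these assumptions the four labels come out as 5.0, 10.0, 15.0, and 20.0.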


In my second revision, however, I dropped that idea and created something a little better:


function GridLabels(x,y,grid,label = "X",buffer=20)
    frame = grid.frame
    divamountx = grid.division_amountx
    total = divamountx
    topx = maximum(x)
    topy = maximum(y)
    xlabels = []
    while total < (divamountx * grid.divisions)
        percentage = frame.height / total
        curr_label = topx * percentage
        push!(xlabels,(curr_label, total))
        total += divamountx
    end
    xtags = ""
    for (key,data) in xlabels
        textlabel = Label(string(round(key)), 40 , data, "", 3)
        xtags = string(xtags, textlabel.tag)
    end
    tag = xtags
    # (rest of the function omitted)
end

The final product will be revealed after I explain my frustration with one more issue with the Hone library…


Non-Modular Plotting

If you’ve ever taken a peek at the scatter functions in Hone and felt that some of the code in them directly contradicted the methodology behind Hone, then you most certainly weren’t alone. Rather than having functions to do things like parse coordinates or add lines to plots, I thought it would be a brilliant idea to hard-code those snippets into each function one at a time. Consider exhibit A:

The _arrayscatter function.


function _arrayscatter(x,y,shape=Circle(.5,.5,25),
        features = [Grid(3), Axis(:X), Axis(:Y)],
        buffer = 90)
    fheight = frame.height - buffer
    fwidth = frame.width - buffer
    topx = maximum(x)
    topy = maximum(y)
    expression = string("")
    # Coordinate parsing -------
    for (i, w) in zip(x, y)
        inputx = (i / topx) * fwidth
        inputy = (w / topy) * fheight
        exp = shape.update(inputx,inputy)
        expression = string(expression,string(exp));
    end
    points = transfertype(expression);
    for feature in features
    composition = eval(expression);
    show() =
    tree() = introspect(composition)
    save(name) = draw(SVG(name), composition);
    get_frame() = frame
    add(obj) = frame.add(obj)

Not only is this function a little ugly, it also requires its own unique type in order to transmit its tag into the frame. Rather than being a collection of objects generated automatically from inputs, it is itself an object with a collection of other objects serving alongside it. In the context of Hone.jl, and the very idea behind it, this just doesn’t make any sense. My first action against this was to create an entirely separate function for creating Axis types:

function Axis(orientation=:X, axiscolor = :gray, frame=Frame(1280,720,0mm,0mm,0mm,0mm), buffer = 90)
    # Pick the endpoints of the axis line for the requested orientation
    if orientation == :X
        pairs = [(buffer,frame.height - buffer), (frame.width,frame.height - buffer)]
    elseif orientation == :Y
        pairs = [(buffer,0),(buffer, frame.height - buffer)]
    end
    axis = Line(pairs,axiscolor)
    tag = axis.update([pairs])
    # (rest of the function omitted)
end

That was relatively straightforward! Next, I decided to do the same thing with the coordinates in order to slim down the function even further.

function Points(x, y, frame=Frame(1280,720,0mm,0mm,0mm,0mm), buffer = 90, shape=Circle(.5, .5, 25))
    fheight = frame.height - buffer
    fwidth = frame.width - buffer
    topx = maximum(x)
    topy = maximum(y)
    tag = string("")
    # Coordinate parsing -------
    for (i, w) in zip(x, y)
        inputx = (i / topx) * fwidth
        inputy = (w / topy) * fheight
        exp = shape.update(inputx,inputy)
        tag = string(tag,string(exp))
    end
    tag = string(tag)
    show() = eval(Meta.parse(string("compose(context(), ", tag,")")))
    (var) -> (tag)
end
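The coordinate parsing step can also be tried on plain numbers, separate from any shapes or tags. The sketch below mirrors the arithmetic in Points under the same defaults (a 1280 by 720 frame and a buffer of 90). `scale_points` is a hypothetical name, and it returns pixel pairs instead of building an expression string.

```julia
# Hypothetical helper mirroring the coordinate parsing step: scale each data
# pair so that the largest x and y values land on the edge of the drawable area.
function scale_points(x, y, width, height, buffer)
    fwidth = width - buffer
    fheight = height - buffer
    topx = maximum(x)
    topy = maximum(y)
    return [((i / topx) * fwidth, (w / topy) * fheight) for (i, w) in zip(x, y)]
end

pts = scale_points([5, 10, 15, 20], [5, 10, 15, 20], 1280, 720, 90)
first_point = pts[1]    # (297.5, 157.5)
last_point = pts[end]   # (1190.0, 630.0), the far corner of the drawable area
```

Note that the y values here are still measured from the top of the frame.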

After all of that copying and pasting and slight refactoring, we have a cute little function that looks like this:


function _arrayscatter(x, y,
        features = [Grid(3), Axis(:X), Axis(:Y)],
        buffer = 90,
    points = Points(x, y, frame, buffer, shape)
    for feature in features
    show() =
    tree() = introspect(composition)
    save(name) = draw(SVG(name), composition);
    get_frame() = frame
    add(obj) = frame.add(obj)

I like this because it focuses a lot more on the objects as objects and the frame as a holder for those objects, rather than the entire array scatter being a composition of objects smashed into meta-expressions with other types alongside it. With these solutions in place, I also noticed another issue revealing itself. If you recall, the origin point for Compose is not where you would expect it to be. Rather than being in the bottom left, it is in the top left. As a result, our y coordinates are reversed, and thus do not match up with our new grid labels.

To fix this, I simply subtracted our new value from the height and used the result as our input y.


function Points(x, y, frame=Frame(1280,720,0mm,0mm,0mm,0mm), buffer = 90, shape=Circle(.5, .5, 25))
    fheight = frame.height - buffer
    fwidth = frame.width - buffer
    topx = maximum(x)
    topy = maximum(y)
    express = string("")
    # Coordinate parsing -------
    for (i, w) in zip(x, y)
        inputx = (i / topx) * fwidth
        inputy = (w / topy) * fheight
        # Flip y so that it is measured from the bottom of the frame
        inputy = fheight - inputy
        exp = shape.update(inputx,inputy)
        express = string(express,string(exp))
    end
    tag = express
    show() = eval(Meta.parse(string("compose(context(), ", tag,")")))
    # (rest of the function omitted)
end

Conclusion

After all of this great work done to the Hone library, I am excited to reveal to you the new default plot as of version 0.0.5!


[Image: the new default plot in Hone.jl 0.0.5]

In my personal opinion, the improvement here is beyond dramatic. First, the data is represented accurately: not upside-down, and not in the wrong place. The X and Y for this visualization are both [5,10,15,20], which is precisely what the plot reads. On top of that, we can actually see where each point sits relative to the data’s scale with our new straight grid and grid labels!

I am incredibly excited about where Hone.jl is now, and where it is going to be very shortly! There is a lot of work to do, but I am hoping that the base of the graphing architecture is very nearly done, and that my work-around bug fixes aren’t going to need to exist anymore! (In a perfect world.) Regardless of the occasional difficulty, I have learned a lot from this project, and I am very excited to see how effective it proves in my own usage as well as others’!

