A little note to myself on “why graph models” for data science.
Machine learning is dominated by statistical models. And that’s for a good reason: they perform quite well. Amazingly well, to be honest.
Graph models don’t get much attention, except maybe in the Neo4j community and for fraud detection. Here are 3 points where they shine over statistical models.
1 — It’s difficult for a statistical model to use signals with a small sample. A single phone number is associated with a few people. …
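One way to picture the phone-number example is the minimal sketch below. It uses plain dictionaries rather than a graph database, and every name and phone number in it is made up for illustration. The idea it tries to show: in a graph, a phone shared by only two people is still a link you can follow directly, no matter how rarely it appears in the data.

```python
from collections import defaultdict

# Made-up (person, phone) pairs; none of this comes from real data.
edges = [
    ("ana", "555-0101"),
    ("bruno", "555-0101"),
    ("bruno", "555-0142"),
    ("davi", "555-0142"),
    ("carla", "555-0199"),
]

# Index the bipartite graph in both directions.
people_by_phone = defaultdict(set)
phones_by_person = defaultdict(set)
for person, phone in edges:
    people_by_phone[phone].add(person)
    phones_by_person[person].add(phone)


def shares_phone_with(person):
    """Return everyone connected to `person` through at least one phone."""
    neighbours = set()
    for phone in phones_by_person.get(person, ()):
        neighbours |= people_by_phone[phone]
    neighbours.discard(person)  # a person is not their own neighbour
    return neighbours


print(shares_phone_with("bruno"))
```

Here `shares_phone_with` is a hypothetical helper, not part of any library. A real setup would more likely run a similar traversal as a query in something like Neo4j.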
CRUD is a well-known acronym in software. It stands for Create, Read, Update, and Delete, the basic operations you need in any system that stores data.
When talking about user interfaces, the “read” part is usually split in two: list the records, and show a detailed view of each one. As far as I know, Rails popularized that convention, and several frameworks have adopted it.
So, this is a note for me, but it might be useful for others: let’s use “CUDLS” (cuddles?) as an acronym for those interface operations and let “CRUD” stand for the storage operations.
Spacemacs has what we love from VI, but with batteries included and with sane defaults.
The idea here is to show some concepts that help explain why a lot of people like Spacemacs and the VI editing style, and then give you some helpful commands to play with.
So, here are the two concepts to understand why we ❤️ Spacemacs:
I’ve been toying with Scheme for quite some years, first with Racket and recently with Guile. I was never a heavy user of them, but I learned a trick here and there. It was not until today that I realized that Scheme is the “Linux” of programming languages (I’ll explain that soon).
Pimp my Scheme!
I was just playing again with the excellent book “Mazes for Programmers” by Jamis Buck (ten minutes ago, to be honest) when I commented to my wife:
_ You know what? I haven’t had fun like this since I learned Ruby for the first time.
_ Hmmm… So why don’t you write about it? …
My English was never great, but as I’m going to take the IELTS I need to polish all the rough edges and, of course, learn more.
One thing that bothers me is the use of prepositions; I never get them right. “On”, “in”, and “at” are a big source of confusion, as are “to”, “for”, and “by”. So, I wanted to trick my future self into learning them, as I know he’s lazy. The solution? Create a small game I can play anytime, so the boring task becomes fun.
I created a bookmarklet (yeah, yeah, bookmarklets are dead, I know!) that replaces all the prepositions in any web page with select boxes. As you choose an option, it turns green or red to indicate whether you got it right or…
PCA is a technique used mainly for 3 goals:
It’s not difficult to find articles about it all over the internet, but I’m struggling to get an intuitive understanding of PCA. Almost everything I found tries to explain the linear algebra behind it, but doesn’t give much insight (at least for my non-mathematical mind).
So, here I try to understand what it does and to give a non-mathematical explanation. Of course, I’ll commit several sins doing that and also I’ll tell some “half-truths”, but bear with me, it’s going to be useful. …
I was trying to get a better understanding of the “C” regularization term in SVM with a Gaussian kernel. When I tried some low values, I got VERY weird results (wait for it…)
Here is an animated gif with the problem:
The gif shows the behavior of the SVC using the “predict” and “decision_function” methods. The background is the prediction and the points are the training values. So, what’s wrong?
Programming languages vary a lot, but some things are always difficult. One of these is naming variables and functions.
You probably follow some kind of code convention, like the ones for Ruby, Python, Scala, or Haskell. However, they seldom offer advice about which names to use. Here I offer you some rules I follow and a list of my preferred words (Warning: some are less than… ahem… conventional, but they work really well).
In the last post, I explained very quickly how pattern matching works in Racket and how to create new patterns.
However, there is a little boilerplate code that bothers me. It’s a “macro-thing” that’s not directly related to creating new patterns. So, I created a macro to remove this macro from view. Err … something tells me that’s not the sanest approach, but … =/
Here is a new pattern to match things nested in lists:
My macro is simple(istic), just an extra line and some generalization:
And it can be used like this:
This macro is a nested one, i.e., it’s a macro that generates another macro. It’s not a problem for Racket (it can handle that easily), but it IS a little weird for human minds. …
One of the best features in Python (IMHO) is the “generator” thing.
Here’s a simple example of a generator in action:
The function produces values and hands them to a “for loop” as if they were a collection. The obvious use is to avoid creating whole arrays and filling up memory.
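To make that concrete, here is a minimal sketch of a generator. The `countdown` function is a made-up example, not the one from the original post. It yields one value at a time, so the loop and `sum` below never hold the full sequence in memory.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1, one value at a time."""
    current = start
    while current > 0:
        yield current  # execution pauses here until the next value is requested
        current -= 1


# The for loop consumes the generator as if it were a collection.
for number in countdown(3):
    print(number)

# No list of a million integers is ever built here.
total = sum(countdown(1_000_000))
```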
I like them for another reason: they allow me to remove nesting. Recently I created a simple CSV from a query in Neo4j. The code goes like this:
Not pretty: the CSV (consumer) and Neo4j (producer) logic is interleaved. Generators to the rescue:
I find this better from a reuse perspective. The producer and consumer logic is separated now. The producer function can generate records for any use. Also, I can use the write_to_csv function to write any collection of hash-like things to CSV. …
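A rough sketch of that separation, under some loud assumptions: `fetch_people` is a hypothetical producer that stands in for the Neo4j query and yields made-up rows, and the signature of `write_to_csv` here is my guess rather than the original one. Only the standard library is used.

```python
import csv
import io


def fetch_people():
    """Producer: yields one dict per record.

    Hypothetical stand-in for the Neo4j query; the rows are made up.
    """
    rows = [("Ann", 34), ("Bob", 27)]
    for name, age in rows:
        yield {"name": name, "age": age}


def write_to_csv(records, output, fieldnames):
    """Consumer: writes any iterable of dict-like records as CSV."""
    writer = csv.DictWriter(output, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# An in-memory buffer keeps the sketch self-contained; a real file works the same way.
buffer = io.StringIO()
write_to_csv(fetch_people(), buffer, fieldnames=["name", "age"])
csv_text = buffer.getvalue()
print(csv_text)
```

Since `write_to_csv` only needs something it can iterate over, a plain list of dicts works just as well as the generator.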