ElasticSearch: Features you will love

If you need fast, effective full-text search for your app, why not give ElasticSearch a try?
Setting up Elastic is quick and easy, and within a few moments you are ready to explore your data. Just define an index, a type and its mapping, and index your data!
At ICAN Future Star, we use ElasticSearch to store and retrieve our data. We maintain thousands of documents describing universities and courses, mostly text along with some numeric data. Searching through this amount of information wouldn’t be an easy task without the search options ElasticSearch offers.
After using Elastic for the past three months, I am more surprised by its capabilities every day. Here are some of the features I have come to love.

Complex bool queries

Although complex bool queries can be intimidating the first time you use them, you will love them once you get familiar with them.
Taking an example from our own app, let’s say we want to search for documents in our database satisfying the following query:
('engineering' and 'civil') or ('economics' or 'finance')
Starting from the ('engineering' and 'civil') part, we create a bool query using the must keyword. Putting our two match queries in the must array covers the and part of our query. For the ('economics' or 'finance') part, we again use a bool query, but this time we choose the should keyword to cover the or condition. The minimum_should_match parameter lets us define how many of the enclosed conditions have to match. Setting it to 1 satisfies our case. The two sub-queries we have constructed are then combined in an outer should clause to cover the top-level or.
This is the resulting query:
{
    "bool": {
        "should": [{
            "bool": {
                "must": [{
                    "match": {
                        "title": "engineering"
                    }
                }, {
                    "match": {
                        "title": "civil"
                    }
                }]
            }
        }, {
            "bool": {
                "should": [{
                    "match": {
                        "title": "economics"
                    }
                }, {
                    "match": {
                        "title": "finance"
                    }
                }],
                "minimum_should_match": 1
            }
        }],
        "minimum_should_match": 1
    }
}
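As a rough sketch of how such nested queries could be assembled in application code, here is a small Python example that builds the same request body from three helpers. The helper names match, all_of and any_of are made up for this illustration and are not part of any Elastic client library; the script only builds and prints the JSON, it does not talk to a cluster.

```python
import json


def match(field, text):
    # A single full-text match clause on one field.
    return {"match": {field: text}}


def all_of(*clauses):
    # "and": every clause has to match, so they go under must.
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    # "or": at least one clause has to match, so they go under should
    # with minimum_should_match set to 1.
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


# ('engineering' and 'civil') or ('economics' or 'finance')
query = any_of(
    all_of(match("title", "engineering"), match("title", "civil")),
    any_of(match("title", "economics"), match("title", "finance")),
)

print(json.dumps(query, indent=4))
```

Building the query from small functions like these keeps the and/or structure readable even when the conditions get deeper than two levels.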

Multi-fields

During the development of our software, we ran into the need to use one field for two different purposes. For example, we wanted a field to be mapped as analyzed for full-text search and, at the same time, as non-analyzed so that we could search for its exact value.
Elastic provides the multi-fields functionality, which lets you define multiple mappings for a single field.
Here is the example of our mapping for the category field in one of our documents:
{
    "mappings": {
        "document_type": {
            "properties": {
                "category": {
                    "type": "text",
                    "analyzer": "custom_analyzer",
                    "fields": {
                        "exact": {
                            "type": "keyword"
                        }
                    }
                }
            }
        }
    }
}
Using this functionality, we keep two different versions of the same field: one analyzed with our custom analyzer and the exact version that is being handled as a keyword.
Now, using the first version of the field in the following query:
"query": {
    "match": {
        "category": "engineering"
    }
}
the query value will be analyzed by the custom_analyzer and it will return the documents matching the resulting tokens.
On the other hand, using the exact version of the category field:
"query": {
    "match": {
        "category.exact": "engineering"
    }
}
we will get just the documents that explicitly match the requested category.
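To make the difference between the two versions of the field more concrete, here is a toy model in plain Python. It is only a sketch: the analyze function below (lowercase, split on whitespace) is a stand-in I made up, not what our custom analyzer or Elastic actually does, and the sample category values are invented.

```python
def analyze(text):
    # Stand-in analyzer (an assumption): lowercase and split on whitespace.
    return text.lower().split()


# Invented sample values for the category field.
docs = ["Civil Engineering", "Engineering", "Software Engineering"]


def match_analyzed(docs, query):
    # Analyzed field: both the stored value and the query become tokens,
    # and a document matches if they share at least one token.
    query_tokens = set(analyze(query))
    return [d for d in docs if query_tokens & set(analyze(d))]


def match_exact(docs, query):
    # Keyword-style field: the whole value is compared as one exact string.
    return [d for d in docs if d == query]


print(match_analyzed(docs, "engineering"))  # all three documents
print(match_exact(docs, "Engineering"))     # only the exact value
print(match_exact(docs, "engineering"))     # nothing: the case differs
```

The last line shows why the exact version is handy for filters and facets, and why the value you search for has to be written exactly as it was stored.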

Reindex API

After defining a mapping and loading the data into Elastic, it is common to need to change the type of a field or to add a new field. Although adding a new field to an existing mapping is straightforward, changing the mapping of an existing field is, in general, not possible.
In this case, the solution is to create a new index with the new mapping and index the data again.
Here comes the Reindex API, a convenient way to copy the data from an old index to a new one. It is easy to use: it copies the documents from one index to another, indexing them according to the mapping of the destination index. An example of use:
POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}
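The whole re-mapping workflow is two requests: first create the destination index with the new mapping, then call _reindex. Here is a minimal Python sketch that just lists those requests in order. The function reindex_plan and the example mapping are hypothetical and only for illustration; nothing is sent to a cluster.

```python
import json


def reindex_plan(old_index, new_index, new_mapping):
    # Ordered (method, path, body) requests for moving data to a new mapping.
    return [
        # 1. Create the destination index with the new mapping first.
        ("PUT", "/" + new_index, {"mappings": new_mapping}),
        # 2. Copy the documents across; they are indexed with the new mapping.
        (
            "POST",
            "/_reindex",
            {"source": {"index": old_index}, "dest": {"index": new_index}},
        ),
    ]


# Hypothetical new mapping: category becomes a plain keyword field.
new_mapping = {"document_type": {"properties": {"category": {"type": "keyword"}}}}

plan = reindex_plan("old_index", "new_index", new_mapping)
for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```

The order matters: if the destination index does not exist yet when _reindex runs, Elastic creates it with dynamically guessed mappings, which is usually not what you wanted.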

Function Score in a Parent-Child relationship

In our database, we have documents that are connected through a Parent-Child relationship. We very often need to sort the child documents according to fields of their parent documents. However, Elastic does not provide a way to do this with the regular sort options.
The solution in this case is to use a function_score query in combination with the has_parent query. By using the doc notation in a script, one can access the fields of the parent documents.
{
    "query": {
        "has_parent": {
            "parent_type": "parent",
            "score": true,
            "query": {
                "function_score": {
                    "script_score": {
                        "script": {
                            "lang": "painless",
                            "inline": "double total = 0; for (item in doc['numbers']) { total += params.multiply_factor * item; } return total;",
                            "params": {
                                "multiply_factor": 7
                            }
                        }
                    }
                }
            }
        }
    }
}
In the above example, the script iterates over the numbers field of the parent document, adding up its values multiplied by multiply_factor. In the params field one can define parameters that are used in the script. The score returned by the script is used to sort the child documents, and it is combined with any other scores derived from other queries.
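If the scoring logic is hard to picture, here is the same arithmetic as a plain Python sketch. The parent and child data are invented for the example, and rank_children is a hypothetical helper that imitates the effect of sorting children by their parent's score in descending order; it is a model of the idea, not of how Elastic runs it internally.

```python
def parent_score(numbers, multiply_factor):
    # Same arithmetic as the script: sum of each value times the factor.
    total = 0.0
    for item in numbers:
        total += multiply_factor * item
    return total


# Invented data: parent id -> its "numbers" field, and (child id, parent id) pairs.
parents = {"p1": [1, 2, 3], "p2": [10], "p3": [0.5, 0.5]}
children = [("c1", "p1"), ("c2", "p2"), ("c3", "p3"), ("c4", "p2")]


def rank_children(children, parents, multiply_factor=7):
    # Every child inherits the score computed on its parent;
    # the highest score comes first.
    return sorted(
        children,
        key=lambda child: parent_score(parents[child[1]], multiply_factor),
        reverse=True,
    )


print(parent_score(parents["p1"], 7))  # 7 * (1 + 2 + 3) = 42.0
print(rank_children(children, parents))
```

Children that share a parent end up with the same score, so a second sort criterion is worth adding if their relative order matters.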
These are just some of the ElasticSearch features that we have used during our software implementation. Elastic has more for one to investigate and it is awesome that every time we use it, we keep finding new useful features!
That’s all for now! We will keep updating with new material 🙂

Super Basic Design Principles

Recently I was tasked with re-stylising the HelloUni catalogue for ICAN FUTURE STAR. This is a small document that outlines what we can offer to clients. As part of our regular Knowledge Transfers, I decided to give a talk on the basics of stylising a document, named “Super Basic Design Principles According to Ellie”. I’m going to go through that talk with you all now in the form of a blog post.

 

I’ve always been the type of person who likes to style documents so they flow well; so they become something that’s pleasant to look at, conveys emotion and is easy to read. Below I’ve outlined some of the basic techniques I use when stylising documents. Most of the points will be common sense but I hope you’ll find something of value in them nonetheless.

Know your audience

Firstly, before you even start writing the document you should know a few things about its purpose and target audience. It’s a good idea to imagine somebody from your targeted demographic reading your document. I’m going to use the example of a CV here because I think it’s something that we’ve all experienced or will experience at least once in our lives.

Who are they?

Imagine you’re writing your CV. Who’s going to read it? Well, depending on the company, it could go through a few readers: Secretaries, Hiring Managers, Human Resources, Department Managers, Experts/Specialists, or Algorithms that sort CVs based on keywords. This can be a little daunting, but what’s important here is that you shouldn’t assume prior knowledge. Applying for a technical job doesn’t necessarily mean that the people, or robots, reading your CV are technical people. Using abbreviations or acronyms can be detrimental in these cases. So, think about your audience, then write and style your document accordingly. For the CV example, I’ve added a list of Skills at the top of the page that can be picked up by machines to get through the first stage of the job process, and next is a personal statement to connect on a more human level.

 

How will they read the document?

The way most people consume media has changed. Websites are visited more often on mobile devices than on desktop/laptop browsers (source). If you’re styling a document and think that it will most likely be read on a smaller screen, like a phone or a tablet, it’s best to take that into consideration. Don’t add a lot of images in lieu of important information. Keep it balanced. You want to avoid making the reader zoom in and out of sections of text.

 

If you’re handing the reader a physical copy of the document and it’s likely they won’t read it there and then in front of you, try adding a front cover; something appealing that can catch somebody’s attention from across the room. This way it’s more likely that they’ll read the document when they’re ready to.

 

In my example, the CV is designed for a technical position, so it’s likely this document would be passed around a company via email or printed out so people could write notes on it. For this reason, the CV is mainly white with plenty of space around the borders and sections. It’s a simple orange and black colour design and does not contain images. This keeps the file size small and will not waste a lot of ink if printed. The text is also a standard size that can be read on a tablet screen, and all the important information is presented on a single page so readers don’t need to scroll down or turn the page to see potentially hidden information.

 

White Space

Next up is the use of white space. White space is simply an absence of information. It is a very useful principle that has a lot of benefits.

 

It gives any document logical breaks and allows elements to be grouped into sections. This makes information much easier to read and generally looks more professional. Your white space also doesn’t necessarily need to be white; it just needs to be an absence of information.

 

In the CV example, I used white space to break up the sections. I’ve also left a thick margin between the edge of the page and the information. This is so that, if the CV is printed, people can write notes directly onto it as it is passed around the company.

 

Emphasis and Consistency

Simply put, emphasis and consistency are just how you use white space and colours. There are a lot of articles online that talk about colour theory far more articulately than I can, so I’ll leave that up to them. Once you’ve got your colours and the emotions you want to convey with the document, you should stick to them. Simple things like keeping header and body fonts, colours and sizes consistent make a huge difference to the credibility of a document. These improvements should make the document look deliberate and thought out to the reader, even if they don’t notice it themselves.

 

If you have a lot of information it’s good to emphasise the key points. Some people are skimmers and scanners; they’ll read the text quickly, trying to pick out information and avoiding filler words such as “and”, “the” and “they”.

 

Whether to highlight words inside paragraphs or just stick to headers depends entirely on the purpose of the document and the target reader. In the CV example, I’ve only highlighted the hyperlinks, as the CV uses an analogous orange style. Having too much orange would end up de-emphasising the links. This is something we’d want to avoid, as the links either point towards contacting me or point to previous work.

Flow

Last, I’d like to briefly mention the importance of flow. Flow is an interesting concept to me as it’s easy to explain but can be difficult to put into action efficiently. It’s not necessarily about ordering information in order of importance. Information can be presented in chronological order, or you can present it as if you were telling a story. The main point is to get the information to flow logically. When I was giving my presentation I said something similar to “You wouldn’t read the even pages of a book then go back and read the odd”.

 

The flow of the CV is as follows:

The reader is automatically drawn to the name: it’s large, orange and at the top of the page. I did this so my name is in the head of the reader while they continue reading; I want to be memorable. Next, they’ll register the links under the name. It should be obvious that they’re just contact links, so the reader intentionally or unintentionally skips them to get to the important information.

 

Skills are next. If the reader is a robot it will pick out the keywords quickly. If the reader is a technically savvy person they might spend some time looking at the skills to see if I have what it would take to do the job. If the reader is non-technical they will skip this section.

 

Next is the personal statement, the human component. I want to try to establish a connection with the reader. I talk about what motivates me and my hobbies. By this time, hopefully, I’ve got the attention of any reader: Robot, Tech-Savvy and Non-Technical.

 

Experience and Academia are obviously important sections to add to a CV. As previously mentioned, I’ve written these as rows and columns. This way the reader can just look down the first column to find what I was doing and when; they can then read a little more about that time. In each description, I’ve spoken about the time from a technical view and, where possible, added links to the projects I had worked on. This should please the robot and the Tech-Savvy reader, and it should showcase to the Non-Technical reader what I’m capable of doing with the skills I had mentioned.

 

I finish the CV with additional links in a less formal style of writing and a small message to once again connect on a human level.

 

Tip of the Iceberg

There’s a whole world out there of design patterns, tips, best practices etc. This article is just meant to be the super basic design principles that I’ve found help me personally and professionally. I believe the design of a document is just as important as the content and I hope you can see why.

 

Thanks for reading

Lunch time: Beyond nourishing

As a Spaniard, lunch time in the UK is one of the toughest culture shocks I face. In Mediterranean cultures, food is a big thing. It is not merely the act of getting the nutrients your body needs to, basically, not die. It is a truly social act where you gather together around a table and food is just an excuse. Back in the early days when I arrived in Scotland, I worked as an industrial cleaner at the headquarters of an important British company. I remember cleaning loads of food leftovers from the desks. Sometimes I actually saw people eating their lunches in front of their computers while they kept on working. And I could not help thinking: “this is wrong”. Moreover, I used to live near the financial district in Edinburgh (if such a thing even exists) and I used to see workers from the nearby bank offices eating a sandwich in their cars at lunch time. A genuinely anti-social act.

The developers at ICAN take a Mediterranean approach. Well… this is not hard with a Greek and a Spaniard on board. Thus, every day we all go together for lunch to one of the many different options available around the office. Sometimes agreeing on the place may be a bit hard, but in general agreement is reached quite fast. During this time we get away from our desks and we speak about everything but work: from our individual daily problems to our own cultures, languages, history, current affairs, food, cooking… basically, anything but work. Definitely the complete opposite of a British lunch time.

The benefits of this approach are obvious. Not only is it the time when we recover the energy to finish the working day, but it is also a time when we get to know each other better. This not only strengthens bonds. It also gives us better communication patterns. Similarly, it lets us gauge the general emotional state of the squad, and helps us better understand the idiosyncrasies of each member of the team. It also helps to overcome the obvious cultural gaps in a multi-national crew like ours, which improves inter-personal relationships. In simple terms: something as mundane as eating becomes an unquestionable team-building activity.

Yet these are not the only benefits. This lunch time acts as a real break. Leaving your desk, leaving the office and ceasing to talk about work creates a new context with a different frame of mind. And for people who work in a highly demanding intellectual activity like programming, this is essential. Therefore, lunch is also the time when the brain rests. And this also helps productivity.

Having said all this, I must also say I am sick of eating pasta, pizza and burgers during my working week. Because if lunch is all that I said before, as a Spaniard it is also about eating a proper heart-warming meal. And at this point the cultural gap is still too large, I am afraid.