Monday, December 23, 2013

Elasticsearch index slow log for search and indexing

Today we are going to look at slow logging in Elasticsearch for search and indexing. The Elasticsearch config file, elasticsearch.yml, should contain a section such as the one below:
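################################## Slow Log ##################################

# Shard level query and fetch threshold logging.

#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms

#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms

#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.trace: 500ms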

In this example, I have enabled trace logging for the search query phase and the search fetch phase, with thresholds of 500ms and 200ms respectively. A search in Elasticsearch consists of a query phase and a fetch phase, hence the two settings for search. Logging for indexing is also enabled, with a threshold of 500ms.

With this configuration in place, whenever an indexing or search operation
exceeds its threshold, an entry is written to a log file. The log file is
located in the directory set by path.logs in elasticsearch.yml.

So what do these numbers really mean? Here are excerpts from the official Elasticsearch documentation:

The logging is done on the shard level scope, meaning the execution of a search request within a specific shard. It does not encompass the whole search request, which can be broadcast to several shards in order to execute. Some of the benefits of shard level logging is the association of the actual execution on the specific machine, compared with request level.


All settings are index level settings (and each index can have different values for it), and can be changed in runtime using the index update settings API.


... and I have tried updating the index setting with a simple tool I made earlier. The idea is the same without the tool: you just need to send an HTTP PUT request carrying the setting to the index settings endpoint. More information is available in the official documentation, and the configuration keys are defined in the Elasticsearch source.
[jason@node1 bin]$ ./ set search.slowlog.threshold.query.trace 500
"ok" : true,
"acknowledged" : true
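
If you don't have such a tool at hand, the same update can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the node is assumed to listen on localhost:9200, the index name index_test is borrowed from the log entry in this post, and the helper name build_slowlog_request is made up for illustration. It builds the request without sending it, so you can inspect it first.

```python
import json
import urllib.request


def build_slowlog_request(index, setting, value, host="http://localhost:9200"):
    """Build (but do not send) a PUT request for the index update settings API.

    `host` and the helper name are assumptions made for this sketch.
    """
    body = json.dumps({setting: value}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_slowlog_request(
    "index_test", "index.search.slowlog.threshold.query.trace", "500ms"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually apply the setting against a running node (assumed reachable):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the request building separate from the sending makes it easy to check the URL and body before touching a live cluster.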

[2013-12-23 12:31:12,758][TRACE][] [node1] [index_test][146] took[1s], took_millis[1026], types[foo,bar], stats[], search_type[QUERY_THEN_FETCH], total_shards[90], source[{"size":80,"timeout":10000,"query":{"filtered":{"query":{"query_string":{"query":"maxis*","default_operator":"and"}},"filter":{"and":{"filters":[{"query":{"match":{"site":{"query":"","type":"boolean"}}}},{"range":{"unixtimestamp":{"from":null,"to":1387825199000,"include_lower":true,"include_upper":true}}}]}}}},"filter":{"query":{"match":{"site":{"query":"","type":"boolean"}}}},"sort":[{"unixtimestamp":{"order":"desc"}}]}], extra_source[],

In this example, the query ran for about 1 second (1026ms), which exceeds the threshold set at 500ms.
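
If you want to pull the numbers out of such entries automatically, a small parser can help. The sketch below is an assumption-laden example, not an official format definition: the regular expression only covers the leading fields of the entry shown in this post, the sample line is shortened, and the names SLOWLOG_RE and parse_slowlog_line are made up.

```python
import re

# Matches only the leading fields of the slow log entry shown in this post.
SLOWLOG_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\[(?P<level>\w+)\s*\]\[[^\]]*\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s*"
    r"took\[(?P<took>[^\]]+)\],\s*took_millis\[(?P<took_millis>\d+)\]"
)


def parse_slowlog_line(line):
    """Return the leading fields of a slow log line as a dict, or None."""
    match = SLOWLOG_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["shard"] = int(entry["shard"])
    entry["took_millis"] = int(entry["took_millis"])
    return entry


# Shortened copy of the entry from this post.
sample = (
    "[2013-12-23 12:31:12,758][TRACE][] [node1] [index_test][146] "
    "took[1s], took_millis[1026], types[foo,bar], stats[]"
)
entry = parse_slowlog_line(sample)
threshold_millis = 500  # the trace threshold configured earlier

print(entry["index"], entry["shard"], entry["took_millis"])
print("over threshold by", entry["took_millis"] - threshold_millis, "ms")
```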

As for indexing, the fundamental concept is the same, so we won't elaborate on it in this article; that is left to you as an exercise. :-)

Sunday, December 22, 2013

Learning Jmxterm

If you have been using jconsole to inspect how an application performs under the JVM, you might want a command-line alternative. In this article, we are going to spend some time learning Jmxterm. So what is Jmxterm? Jmxterm is a command-line based interactive JMX client. It is designed to let users access a Java MBean server from the command line without a graphical environment. In other words, it's a command-line based jconsole.

To get started, you will of course need a JDK installed and a Java application that you want to inspect. Then download Jmxterm from its project site. You should end up with a jmxterm-[version].jar file.

So, I'm going to demonstrate how to use Jmxterm by showing the output of a terminal session.
$ java -jar jmxterm-1.0-alpha-4-uber.jar
Welcome to JMX terminal. Type "help" for available commands.
#IllegalArgumentException: Command help; isn't valid, run help to see available commands
#following commands are available to use:
about - Display about page
bean - Display or set current selected MBean.
beans - List available beans under a domain or all domains
bye - Terminate console and exit
close - Close current JMX connection
domain - Display or set current selected domain.
domains - List all available domain names
exit - Terminate console and exit
get - Get value of MBean attribute(s)
help - Display available commands or usage of a command
info - Display detail information about an MBean
jvms - List all running local JVM processes
open - Open JMX session or display current connection
option - Set options for command session
quit - Terminate console and exit
run - Invoke an MBean operation
set - Set value of an MBean attribute
$> bean
#IllegalStateException: Connection isn't open yet. Run open command to open a connection
#following domains are available
#IllegalStateException: Connection isn't open yet. Run open command to open a connection
5552 ( ) - jmxterm-1.0-alpha-4-uber.jar
$>help open
usage: open [-h] [-p <val>] [-u <val>]
Open JMX session or display current connection
-h,--help Display usage
-p,--password <val> Password for user/password authentication
-u,--user <val> User name for user/password authentication
Without argument this command display current connection. URL can be a <PID>,
<hostname>:<port> or full qualified JMX service URL. For example
open localhost:9991,
open jmx:service:...
#RuntimeIOException: Runtime IO exception: Connection refused to host:; nested exception is: Connection refused
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#bean is set to org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#class name = org.apache.cassandra.db.ColumnFamilyStore
# attributes
%0 - AutoCompactionDisabled (boolean, r)
%1 - BloomFilterDiskSpaceUsed (long, r)
%2 - BloomFilterFalsePositives (long, r)
%3 - BloomFilterFalseRatio (double, r)
%4 - BuiltIndexes (java.util.List, r)
%5 - ColumnFamilyName (java.lang.String, r)
%6 - CompactionStrategyClass (java.lang.String, rw)
%7 - CompressionParameters (java.util.Map, rw)
%8 - CompressionRatio (double, r)
%9 - CrcCheckChance (double, w)
%10 - DroppableTombstoneRatio (double, r)
%11 - EstimatedColumnCountHistogram ([J, r)
%12 - EstimatedRowSizeHistogram ([J, r)
%13 - LifetimeReadLatencyHistogramMicros ([J, r)
%14 - LifetimeWriteLatencyHistogramMicros ([J, r)
%15 - LiveCellsPerSlice (double, r)
%16 - LiveDiskSpaceUsed (long, r)
%17 - LiveSSTableCount (int, r)
%18 - MaxRowSize (long, r)
%19 - MaximumCompactionThreshold (int, rw)
%20 - MeanRowSize (long, r)
%21 - MemtableColumnsCount (long, r)
%22 - MemtableDataSize (long, r)
%23 - MemtableSwitchCount (int, r)
%24 - MinRowSize (long, r)
%25 - MinimumCompactionThreshold (int, rw)
%26 - PendingTasks (int, r)
%27 - ReadCount (long, r)
%28 - RecentBloomFilterFalsePositives (long, r)
%29 - RecentBloomFilterFalseRatio (double, r)
%30 - RecentReadLatencyHistogramMicros ([J, r)
%31 - RecentReadLatencyMicros (double, r)
%32 - RecentSSTablesPerReadHistogram ([J, r)
%33 - RecentWriteLatencyHistogramMicros ([J, r)
%34 - RecentWriteLatencyMicros (double, r)
%35 - SSTableCountPerLevel ([I, r)
%36 - SSTablesPerReadHistogram ([J, r)
%37 - TombstonesPerSlice (double, r)
%38 - TotalDiskSpaceUsed (long, r)
%39 - TotalReadLatencyMicros (long, r)
%40 - TotalWriteLatencyMicros (long, r)
%41 - UnleveledSSTables (int, r)
%42 - WriteCount (long, r)
# operations
%0 - long estimateKeys()
%1 - void forceMajorCompaction()
%2 - java.util.List getSSTablesForKey(java.lang.String p1)
%3 - void loadNewSSTables()
%4 - void setCompactionThresholds(int p1,int p2)
#there's no notifications
$>get WriteCount
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
WriteCount = 0;
$>get TotalDiskSpaceUsed
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
TotalDiskSpaceUsed = 9437;

So, a brief explanation of what I have just done. To start, you run Jmxterm from a terminal. To find out what commands it has and what you can use them for, simply issue the help command. In order to inspect anything, you first need to open a connection to the JVM. Once a connection is established, you can do all sorts of operations. In this example, I connected to cassandra, selected its bean org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies and read the WriteCount and TotalDiskSpaceUsed statistics.
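
If you want to collect these values from a script instead of typing them interactively, the session above can be sketched in Python. Treat this as a sketch under assumptions: the helper names are made up, and the -l (connection URL) and -n (non-interactive) options are how I recall Jmxterm being driven from a pipe, so check the tool's own usage output before relying on them. Only the bean, get and close commands come from the session shown above.

```python
import subprocess


def build_jmxterm_script(bean, attributes):
    """Return Jmxterm commands that read the given attributes from one MBean."""
    lines = [f"bean {bean}"]
    lines += [f"get {name}" for name in attributes]
    lines.append("close")
    return "\n".join(lines) + "\n"


def parse_get_output(output):
    """Collect 'Name = value;' lines from Jmxterm output into a dict of strings."""
    values = {}
    for line in output.splitlines():
        line = line.strip()
        # Lines starting with '#' are Jmxterm status messages, not values.
        if line.startswith("#") or " = " not in line:
            continue
        name, _, value = line.partition(" = ")
        values[name] = value.rstrip(";")
    return values


def run_jmxterm(jar_path, url, script):
    """Feed the script to Jmxterm on stdin; the -l and -n flags are assumptions."""
    result = subprocess.run(
        ["java", "-jar", jar_path, "-l", url, "-n"],
        input=script, capture_output=True, text=True, check=True,
    )
    return result.stdout


BEAN = "org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies"
print(build_jmxterm_script(BEAN, ["WriteCount", "TotalDiskSpaceUsed"]), end="")

# Output copied from the session above, used here instead of a live JVM.
captured = f"""#mbean = {BEAN}:
WriteCount = 0;
#mbean = {BEAN}:
TotalDiskSpaceUsed = 9437;
"""
print(parse_get_output(captured))
```

With a real JVM you would call run_jmxterm("jmxterm-1.0-alpha-4-uber.jar", "localhost:7199", script) and pass its result to parse_get_output.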

That's all folks! Hope this gives you an idea of what it does and where it might be useful to you.

Saturday, December 21, 2013

vifm a true gem

filemanager goes vim

  * An ncurses file manager with a vim-like UI. As a vim user you will feel
right at home: commands such as dd delete an entry just like deleting a line
in vim, and you can move to the other window and type p to paste from the
"clipboard" there. The normal movement keys hjkl work as expected: j/k move
down/up an item in the list and h/l move up/down a directory. As in vim, most
settings are made in its rc file; the vifmrc lives in ~/.vifm. To get an idea
of all the options vifm has, browse the project homepage; it's amazing what
you can do with vifm. If, like me, you have been using vim for some time, this
file manager is a true gem. And much like vim, the options are "endless". On
the project homepage you will find the source code, setup instructions and
help for making your own setup. I often look at the configs and setups on
github to get ideas and maybe improve my own setup.

  * The documentation for vifm is available online.

  * I must say that after using vifm for some time, doing some github'ing and
writing my own vifmrc with some nice filetype settings and hard bookmarks, it
is like vim: the more you use it and add to your rc file, the better and
faster it gets.

  * So thanks to ksteen & xaizek for this power-tool.
It doesn't look like much, but it is ;)

Friday, December 20, 2013

A maven introduction.

Recently I have been working on an open-source project and stumbled upon Maven. I'm an Ant guy (with an Ant background), and I guessed that getting started with Maven should not be that difficult.

When you see a pom.xml file in a Java project, that tells you it is a Maven project: pom.xml is the Maven configuration file. For instance, a simple Java project would look like this:
project home
 +-- src
 |    +-- main
 |    |    +-- java
 |    |    +-- resources
 |    +-- test
 |         +-- java
 |         +-- resources
 +-- target
 +-- pom.xml

The most basic command, and the one you will probably use most often, is

mvn package

With the above command, mvn will compile your classes, run any tests and package the deliverable code and resources into target/my-app-1.0.jar. If mvn produces this jar, that should be enough, and the developer can concentrate on the Java project itself.

But if you are adventurous and want to know more about Maven, read on. Maven has a number of phases that you can pass to the mvn command. The following are phases of the standard Maven lifecycle, in the order they run:

  • process-resources

  • compile

  • process-test-resources

  • test-compile

  • test

  • package

  • install

  • deploy
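
The important point about this ordering is that asking Maven for one phase also runs every phase before it. The toy model below only illustrates that rule; it is not how Maven is implemented, and the names LIFECYCLE and phases_executed are made up for this sketch.

```python
# Phase order copied from the list above (a subset of Maven's default lifecycle).
LIFECYCLE = [
    "process-resources",
    "compile",
    "process-test-resources",
    "test-compile",
    "test",
    "package",
    "install",
    "deploy",
]


def phases_executed(target):
    """Return every listed phase that runs when `mvn <target>` is invoked."""
    if target not in LIFECYCLE:
        raise ValueError(f"unknown phase: {target}")
    return LIFECYCLE[: LIFECYCLE.index(target) + 1]


print(phases_executed("package"))
```

So mvn package compiles and tests the code before packaging it, while mvn compile stops right after compilation.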

To satisfy the library dependencies of your project, specify the coordinates of each library it depends on in pom.xml. You can use a Maven repository search site to look up the libraries you need.
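
A coordinate is made up of a group id, an artifact id and a version. To show how that maps to the entry you put inside the dependencies section of pom.xml, here is a small sketch that renders one. The coordinate junit:junit:4.11 is only an illustrative choice, and the helper name dependency_element is made up.

```python
import xml.etree.ElementTree as ET


def dependency_element(coordinate, scope=None):
    """Turn 'groupId:artifactId:version' into a pom.xml <dependency> element."""
    group_id, artifact_id, version = coordinate.split(":")
    dependency = ET.Element("dependency")
    ET.SubElement(dependency, "groupId").text = group_id
    ET.SubElement(dependency, "artifactId").text = artifact_id
    ET.SubElement(dependency, "version").text = version
    if scope is not None:
        ET.SubElement(dependency, "scope").text = scope
    return dependency


# Illustrative coordinate; replace it with the library your project needs.
xml_text = ET.tostring(
    dependency_element("junit:junit:4.11", scope="test"), encoding="unicode"
)
print(xml_text)
```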

I hope this gives you a simple start in using Maven for your Java project. If you have reached this point and have further questions, try this link and this link.

Changing ElasticSearch logging level by updating cluster setting.

In this article, we are going to learn how to update the logging level for all the nodes in an Elasticsearch cluster. Logging is crucial to understanding system behaviour, so from time to time you change the logging level in elasticsearch.yml and restart the Elasticsearch instance so that the new level is picked up. Unfortunately, a restart in live production takes some time (because of shard recovery), so this is not efficient.

Luckily, there is a cluster setting which allows the logging level to be changed on the fly.

So, if you want to understand what is happening in the cluster nodes, you can change the logging level

curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "logger.cluster.service" : "DEBUG"
  }
}'

and tail the Elasticsearch log; you should see new log entries start to appear. These settings are handled by the class NodeSettingsService, so to find out which loggers are worth changing, look at the Elasticsearch packages that register with this class. Examples are cluster.service, cluster.routing.allocation.allocator, indices.ttl.IndicesTTLService, etc. Note that the package prefix, org.elasticsearch, is not needed when the setting is updated.
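
To make the prefix rule concrete, here is a small sketch that turns a fully qualified logger name into the request body used above. The helper name logger_setting is made up, and the sketch only builds the JSON body; it does not talk to a cluster.

```python
import json

# The package prefix that is dropped when the logger setting is updated.
PACKAGE_PREFIX = "org.elasticsearch."


def logger_setting(name, level):
    """Build the transient cluster settings body for one logger."""
    if name.startswith(PACKAGE_PREFIX):
        name = name[len(PACKAGE_PREFIX):]
    return {"transient": {f"logger.{name}": level.upper()}}


body = logger_setting("org.elasticsearch.cluster.service", "debug")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d exactly as in the command above.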

If you want more information, this link would provide better help.

Friday, December 6, 2013

cassandra 2.0 catch 101 – part4

It has been a while since my last post, mainly due to an abundance of work. :-(  In this article, I'm going to share the lessons learned on cassandra 2.0.2 using cqlsh 4.1.0.

Last time, we had to remove all the files in /var/lib/cassandra/ simply because something broke when we upgraded from cassandra 2.0.0 to 2.0.2, and nobody in the team had the time to go into the details. Since this is a just4fun cluster, we agreed to remove the directory /var/lib/cassandra/ and start the cluster with cassandra 2.0.2.

To better understand cassandra, we will take a detailed look at alter table. But before that, let's create a new keyspace and table.
cqlsh> CREATE KEYSPACE jw_schema1 WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};

and the corresponding entry in cassandra's system.log

INFO [Thrift:7] 2013-12-06 16:17:21,902 (line 217) Create new Keyspace: jw_schema1, rep strategy:SimpleStrategy{}, strategy_options: {replication_factor=3}, durable_writes: true

cassandra 2.0 catch 101 – part3

Many of us come from a mysql / postgres background and are used to quickly interfacing with the database from the command line. Commenting in cassandra cql differs from what you may be used to there. Read the example below:
cqlsh:jw_schema1> #select * from users;
Invalid syntax at line 1, char 1
#select * from users;
cqlsh:jw_schema1> --select * from users;
cqlsh:jw_schema1> -select * from users;
Bad Request: line 1:0 no viable alternative at input '-'
cqlsh:jw_schema1> -- select * from users;

So as you can see, the hash glyph does not work in cqlsh; you need to put double dashes in front of the text you want to comment out.
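
If you keep statements in a file and want to drop the commented ones before feeding them to another tool, the rule from the session above can be sketched like this. It only models what the session shows (a line starting with double dashes is a comment; a hash or a single dash is not), and the function names are made up.

```python
def is_cql_comment(line):
    """True when the line starts with the double-dash comment marker."""
    return line.lstrip().startswith("--")


def strip_comment_lines(script):
    """Remove whole-line double-dash comments from a block of CQL statements."""
    kept = [line for line in script.splitlines() if not is_cql_comment(line)]
    return "\n".join(kept)


# The four inputs tried in the cqlsh session above.
for attempt in [
    "#select * from users;",
    "--select * from users;",
    "-select * from users;",
    "-- select * from users;",
]:
    print(repr(attempt), is_cql_comment(attempt))
```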

Voila! =)