Monday 30 October 2017

Apache DRILL - Troubleshooting ChannelClosedException 


AIM : 

To discuss a few tuning parameters that can help avoid "ChannelClosedException" errors in DRILL.
Please note that this is a generic troubleshooting guide.

SYMPTOMS :

The following error messages reported during a query failure could be because of a "ChannelClosedException".
[1] o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: Channel closed /<hostname1>:<port> <--> /<hostname2>:<port2>.
[2] SYSTEM ERROR: Drill Remote Exception

Case 1 : o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: Channel closed /<hostname1>:<port> <--> /<hostname2>:<port2>.
This may be because of an "OutOfMemory" issue in any one of the drillbit nodes. More troubleshooting and tuning details can be found in my previous blog:
https://bigdatainourpalm.blogspot.com/2015/10/apache-drill-troubleshooting.html

Case 2 : If you see logs similar to the following, it is most likely due to timeout issues on the client side.
(i)  o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed channel.  Connection: /<hostname1>:<port> <--> /<hostname2>:<port2>. (user server)
(ii) o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed channel.  Connection: /<hostname1>:<port> <--> /<hostname2>:<port2>. (user client)

A more detailed error stack is given below:

2015-04-26 04:30:50,430 [27856ea2-2b5c-7908-9e89-aef5eec27034:frag:8:147] INFO  o.a.d.e.w.f.FragmentStatusReporter - 27856ea2-2b5c-7908-9e89-aef5eec27034:8:147: State to report: RUNNING
2015-04-26 04:31:11,825 [UserServer-1] INFO  o.a.drill.exec.rpc.user.UserServer - RPC connection /XXX.YY.ZZ.YY:31010 <--> /XYX.YU.YZ.YY:33320 (user server) timed out.  Timeout was set to 30 seconds. Closing connection.
2015-04-26 04:31:11,839 [27856ea2-2b5c-7908-9e89-aef5eec27034:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2613806b-9b7b-1d29-3017-dab770388115:0:0: State change requested RUNNING --> FAILED
2015-04-26 04:31:11,841 [UserServer-1] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2667806b-9b7b-1d29-3017-dab770388115:0:0: State change requested FAILED --> FAILED
2015-04-26 04:31:11,841 [27856ea2-2b5c-7908-9e89-aef5eec27034:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2667806b-9b7b-1d29-3017-dab770388115:0:0: State change requested FAILED --> FAILED

The key here is the 'user server'/'user client' message in the logs.
The default timeout value is 30 seconds. 

SOLUTION/WORKAROUND :

Increase the user client timeout. This can be done by setting the 'drill.exec.rpc.user.timeout' property in the 'drill-override.conf' file.
An example is given below (changing the timeout to 300 seconds, i.e. 5 minutes):

cat drill-override.conf

drill.exec: {
  cluster-id: "ajames-drillbits",
  zk.connect: "<hn>:5181",
  rpc.user.timeout: "300"
}

You need to restart the drillbit services after making this change.
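
On a plain Apache Drill installation this can typically be done with the bundled script, for example (the path is illustrative and depends on your installation):

$DRILL_HOME/bin/drillbit.sh restart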

Apache DRILL - Troubleshooting OutOfMemory Issues


AIM :

To discuss a few tuning parameters that can help avoid "OutOfMemory" errors in DRILL.
Please note that this is a generic troubleshooting guide.

SYMPTOMS :

Drill query failures with the following error messages are also potential candidates for "OutOfMemory" issues.
[1] o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: ChannelClosedException: Channel closed /<hostname1>:<port> <--> /<hostname2>:<port2>.
[2] SYSTEM ERROR: Drill Remote Exception

The above error messages are just the after-effect of an "OutOfMemory" error that occurred on any one of the drillbit nodes running a fragment of the failed query.
As a result, to confirm that the issue is indeed caused by insufficient memory, you need to check the logs from all the drillbit nodes.
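
A quick way to scan each node for this is to grep its drillbit log (the path below is illustrative; it depends on where your installation writes drillbit.log):

grep -i "OutOfMemoryError" $DRILL_HOME/log/drillbit.log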

You should see error logs similar to the one below in at least one of the drillbit node logs:

Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer. 
at io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64) ~[drill-memory-base-1.10.0.jar:4.0.27.Final] 
at org.apache.drill.exec.memory.AllocationManager.<init>(AllocationManager.java:80) ~[drill-memory-base-1.10.0.jar:1.10.0] 
at org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:243) ~[drill-memory-base-1.10.0.jar:1.10.0] 
at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:225) ~[drill-memory-base-1.10.0.jar:1.10.0] 
at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195) ~[drill-memory-base-1.10.0.jar:1.10.0] 
at io.netty.buffer.ExpandableByteBuf.capacity(ExpandableByteBuf.java:43) ~[drill-memory-base-1.10.0.jar:4.0.27.Final] 
at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] 
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] 
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] 
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] 
at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final] 
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:227) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final] 
... 11 common frames omitted 
Caused by: java.lang.OutOfMemoryError: Direct buffer memory 
at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_51] 
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_51] 
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_51] 
at io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final] 


The key here is "Caused by: java.lang.OutOfMemoryError: Direct buffer memory". 
This indicates that the drillbit is running out of the direct memory allocated to it on that node.
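
To see how much direct memory each drillbit actually has and is currently using, you can also query the sys.memory system table (available in recent Drill versions; column names vary by version):

select * from sys.memory;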


SOLUTION/WORKAROUND :

We will discuss three different options to avoid such errors.

[Option 1] Play around with the following two parameters:

(a) planner.width.max_per_node
(b) planner.width.max_per_query

From the official Apache Drill documentation (https://drill.apache.org/docs/configuration-options-introduction/):

planner.width.max_per_node : Maximum number of threads that can run in parallel for a query on a node. A slice is an individual thread. This number indicates the maximum number of slices per query for the query’s major fragment on a node.

planner.width.max_per_query : Same as max per node but applies to the query as executed by the entire cluster. For example, this value might be the number of active Drillbits, or a higher number to return results faster.
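
Before tuning, you can check the current values of these options from the sys.options system table, for example:

select * from sys.options where name like 'planner.width%';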

If the drillbit is running out of memory because of too many threads running on the same node, setting the above values to a lower value and re-executing the query should help.
You can set them either at the session level or at the system level (a system-level example is shown further below).

Session Level:
alter session set `planner.width.max_per_node`=1

If the query works fine with the above value, you can tune it by increasing it to a higher, more optimal value.
A similar approach can be taken for 'planner.width.max_per_query' as well. However, instead of starting with a value of 1, start tuning with a value of 50 or 100 and increase from there.
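
If a lower value works and you want it to persist across sessions for all users, the same options can also be set at the system level. The values below are illustrative starting points, not recommendations:

alter system set `planner.width.max_per_node`=2;
alter system set `planner.width.max_per_query`=100;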

Neither of the above two properties requires a drillbit service restart.

[Option 2] Tune the 'drill.exec.buffer.size' property.

From the official Apache Drill documentation (https://drill.apache.org/docs/start-up-options/):

drill.exec.buffer.size : Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.

The default value is 6. The following is an example of how it affects direct memory usage:
Let's say a query consists of 2 major fragments. The number of minor fragments inside each major fragment is given below:
Major Fragment 0 - 1 Minor Fragment
Major Fragment 1 - 100 Minor Fragments

Major Fragment 1 will have a SENDER that sends data, and a RECEIVER at Major Fragment 0 receives it. Major Fragment 0 will be running on the foreman node. 
'drill.exec.buffer.size' controls the size of the 100 buffers on the foreman. The foreman accumulates the batches being sent to it by the 100 minor fragments in 100 buffers. Each buffer has a soft limit of 6 record batches (drill.exec.buffer.size). So the foreman will essentially be responsible for holding at least 6 * 100 batches in memory. Let's say each batch is 8 MB in size. The foreman will then be responsible for holding approximately 6 * 100 * 8 MB = 4.68 GB of memory. However, this is not all the memory being consumed on the foreman, since there is likely some unreported memory being consumed by Netty buffers as well.

So, if we change the value of 'drill.exec.buffer.size' to 3, then the memory used will be roughly halved (4.68/2 = 2.34 GB). This can resolve the issue.
The new value is to be added in drill-override.conf. Below is an example of the same (setting the value to 3):

cat drill-override.conf

drill.exec: {
  cluster-id: "ajames-drillbits",
  zk.connect: "<hn>:5181",
  buffer.size: "3"
}

Changing the above value requires a restart of the drillbit services.

[Option 3] Increase direct memory 

By default it is 8 GB. This is specified inside the drill-env.sh file in the Drill conf directory.
export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"8G"}
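
To increase it, change the value in this line. For example, to allow 16 GB (an illustrative value; size it according to the memory available on the node):

export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"16G"}
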
This change also requires a restart of the drillbit services.