Class CqlOutputFormat
- java.lang.Object
  - org.apache.hadoop.mapreduce.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
    - org.apache.cassandra.hadoop.cql3.CqlOutputFormat
- All Implemented Interfaces:
org.apache.hadoop.mapred.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
public class CqlOutputFormat extends org.apache.hadoop.mapreduce.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>> implements org.apache.hadoop.mapred.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
The CqlOutputFormat acts as a Hadoop-specific OutputFormat that allows reduce tasks to store keys (and their corresponding bound variable values) as CQL rows (and respective columns) in a given table.

As is the case with the org.apache.cassandra.hadoop.ColumnFamilyInputFormat, you need to set the prepared statement in your Hadoop job Configuration. The CqlConfigHelper class, through its CqlConfigHelper.setOutputCql(org.apache.hadoop.conf.Configuration, java.lang.String) method, is provided to make this simple. You also need to set the keyspace. The ConfigHelper class, through its ConfigHelper.setOutputColumnFamily(org.apache.hadoop.conf.Configuration, java.lang.String) method, is provided to make this simple.

For the sake of performance, this class employs a lazy write-back caching mechanism: its record writer collects the prepared statement's bound variable values created from the reduce task's inputs (in a task-specific map), and periodically makes the changes official by sending a prepared statement execution request to Cassandra.
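The configuration steps described above can be sketched as a job setup. This is a minimal, hedged example: the keyspace (demo_ks), table (word_count), contact address, and UPDATE statement are hypothetical placeholders, and the exact partitioner and address settings depend on your cluster.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlOutputFormat;

// Sketch of a reduce-side job writing to Cassandra via CqlOutputFormat.
// Keyspace, table, address, and statement below are illustrative only.
public class CqlOutputJobSetup
{
    public static Job configure(Configuration conf) throws Exception
    {
        Job job = Job.getInstance(conf, "example-to-cassandra");
        job.setOutputFormatClass(CqlOutputFormat.class);

        // Output keyspace and table, via ConfigHelper (see class description).
        ConfigHelper.setOutputColumnFamily(job.getConfiguration(), "demo_ks", "word_count");
        // Cluster contact point and partitioner; adjust for your deployment.
        ConfigHelper.setOutputInitialAddress(job.getConfiguration(), "localhost");
        ConfigHelper.setOutputPartitioner(job.getConfiguration(), "Murmur3Partitioner");

        // The prepared UPDATE whose bound variables each reduce output record
        // supplies as its List<ByteBuffer> value.
        CqlConfigHelper.setOutputCql(job.getConfiguration(),
                "UPDATE demo_ks.word_count SET count_num = ?");
        return job;
    }
}
```

Note that the record writer binds each output record's List<ByteBuffer> values to the statement's placeholders in order, so the statement and the reducer's value list must agree on arity and types.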
Field Summary
  static java.lang.String BATCH_THRESHOLD
  static java.lang.String QUEUE_SIZE
-
Constructor Summary
  CqlOutputFormat()
-
Method Summary
  protected void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf)
  void checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job)
      Deprecated.
  void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
      Check for validity of the output-specification for the job.
  org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
      The OutputCommitter for this format does not write any data to the DFS.
  org.apache.cassandra.hadoop.cql3.CqlRecordWriter getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress)
      Deprecated.
  org.apache.cassandra.hadoop.cql3.CqlRecordWriter getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
      Get the RecordWriter for the given task.
Field Detail
-
BATCH_THRESHOLD
public static final java.lang.String BATCH_THRESHOLD
- See Also:
- Constant Field Values
-
QUEUE_SIZE
public static final java.lang.String QUEUE_SIZE
- See Also:
- Constant Field Values
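BATCH_THRESHOLD and QUEUE_SIZE name Hadoop Configuration keys that tune the record writer's lazy write-back cache. A minimal sketch of setting them follows; the numeric values are purely illustrative, not recommendations.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.cassandra.hadoop.cql3.CqlOutputFormat;

// Illustrative tuning of the record writer's write-back cache.
// The values below are arbitrary examples, not recommendations.
public class CqlWriterTuning
{
    public static Configuration tune(Configuration conf)
    {
        // Number of buffered statement executions that triggers a flush
        // to Cassandra (illustrative value).
        conf.set(CqlOutputFormat.BATCH_THRESHOLD, "64");
        // Capacity of the record writer's pending-write queue (illustrative).
        conf.set(CqlOutputFormat.QUEUE_SIZE, "256");
        return conf;
    }
}
```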
Method Detail
-
checkOutputSpecs
public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
Check for validity of the output-specification for the job.
- Specified by:
checkOutputSpecs in class org.apache.hadoop.mapreduce.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
- Parameters:
context
- information about the job
-
checkOutputSpecs
protected void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf)
-
checkOutputSpecs
@Deprecated public void checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job) throws java.io.IOException
Deprecated. Fills the deprecated OutputFormat interface for streaming.
- Specified by:
checkOutputSpecs in interface org.apache.hadoop.mapred.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
- Throws:
java.io.IOException
-
getOutputCommitter
public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws java.io.IOException, java.lang.InterruptedException
The OutputCommitter for this format does not write any data to the DFS.
- Specified by:
getOutputCommitter in class org.apache.hadoop.mapreduce.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
- Parameters:
context - the task context
- Returns:
an output committer
- Throws:
java.io.IOException
java.lang.InterruptedException
-
getRecordWriter
@Deprecated public org.apache.cassandra.hadoop.cql3.CqlRecordWriter getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress) throws java.io.IOException
Deprecated. Fills the deprecated OutputFormat interface for streaming.
- Specified by:
getRecordWriter in interface org.apache.hadoop.mapred.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
- Throws:
java.io.IOException
-
getRecordWriter
public org.apache.cassandra.hadoop.cql3.CqlRecordWriter getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws java.io.IOException, java.lang.InterruptedException
Get the RecordWriter for the given task.
- Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<java.util.Map<java.lang.String,java.nio.ByteBuffer>,java.util.List<java.nio.ByteBuffer>>
- Parameters:
context - the information about the current task.
- Returns:
a RecordWriter to write the output for the job.
- Throws:
java.io.IOException
java.lang.InterruptedException