Polybase - maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed
[Question from customer] I have the following data in a text file, delimited by |:

    A | null , ZZ
    C | D

When I run this query using HDInsight:

    CREATE EXTERNAL TABLE myfiledata(
        col1 string,
        col2 string
    )
    row format delimited fields terminated by '|'
    STORED AS TEXTFILE
    LOCATION 'wasb://.....';

I get the following result, as expected:

    A    null , ZZ
    C    D

But when I run the same query using SQL DW Polybase, it throws an error:

    Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.

How do I fix this? Here's my script in SQL DW:

    -- Creating external data source (Azure Blob Storage)
    CREATE EXTERNAL DATA SOURCE azure_storage1
    WITH (
        TYPE = HADOOP
      , LOCATION = 'wasbs://....blob.core.windows.net'
      , CREDENTIAL = ASBSecret
    );

    -- Creating external file format (delimited text file)
    CREATE EXTERNAL FILE FORMAT text_file_format
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT
      , FORMAT_OPTIONS (
            FIELD_TERMINATOR = '|'
          , USE_TYPE_DEFAULT = TRUE
        )
    );

    -- Creating external table pointing to file stored in Azure Storage
    CREATE EXTERNAL TABLE [Myfile] (
        Col1 varchar(5),
        Col2 varchar(5)
    )
    WITH (
        LOCATION = '/myfile.txt'
      , DATA_SOURCE = azure_storage1
      , FILE_FORMAT = text_file_format
    );
We’re currently working on a way to surface the reason for a reject to the user. In the meantime, here's what's happening: by default, the number of rows allowed to fail schema matching is 0. This means that if even one of the rows you’re loading from /myfile.txt doesn’t match the schema, the whole query is aborted. In Hive, a string can hold an arbitrary number of characters, but a varchar(n) cannot. In this case the load fails on varchar(5) for “null , ZZ”, because that value is longer than 5 characters. If you change the REJECT_VALUE in the CREATE EXTERNAL TABLE statement, the other row will be let through. More info can be found here: https://msdn.microsoft.com/library/dn935021(v=sql.130).aspx
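As a rough sketch of what that could look like, the external table from the question can be dropped and recreated in one of two ways: either raise the reject threshold so the oversized row is skipped, or widen the columns so the row fits. The reject value of 1 and the varchar(20) widths are illustrative assumptions, not values from the original post; every other name is taken from the question's script.

```sql
-- Option A (assumed threshold): tolerate up to 1 rejected row.
-- The row containing "null , ZZ" is still rejected, but "C | D" loads.
DROP EXTERNAL TABLE [Myfile];

CREATE EXTERNAL TABLE [Myfile] (
    Col1 varchar(5),
    Col2 varchar(5)
)
WITH (
    LOCATION = '/myfile.txt'
  , DATA_SOURCE = azure_storage1
  , FILE_FORMAT = text_file_format
  , REJECT_TYPE = VALUE   -- count rejects as an absolute number of rows
  , REJECT_VALUE = 1      -- assumption: allow one failed row
);

-- Option B (assumed width): widen the columns so no row is rejected.
DROP EXTERNAL TABLE [Myfile];

CREATE EXTERNAL TABLE [Myfile] (
    Col1 varchar(20),     -- assumption: 20 is wide enough for this file
    Col2 varchar(20)
)
WITH (
    LOCATION = '/myfile.txt'
  , DATA_SOURCE = azure_storage1
  , FILE_FORMAT = text_file_format
);
```

Option A keeps the schema strict and silently drops the offending row, so it only makes sense if losing that row is acceptable. Option B is closer to the Hive behaviour described in the question, where both rows come back.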