Polybase - maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed


[Question from customer]
I have the following data in a text file, delimited by |:
A | null , ZZ
C | D
When I run this query using HDInsight:
CREATE EXTERNAL TABLE myfiledata(
col1 string,
col2 string
)
row format delimited fields terminated by '|' STORED AS TEXTFILE LOCATION 'wasb://.....';
I get the following result as expected:
A null , ZZ
C D
But when I run the same query using SQL DW Polybase, it throws an error:
Query aborted-- the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.
How do I fix this?
Here's my script in SQL DW:
-- Creating external data source (Azure Blob Storage)
CREATE EXTERNAL DATA SOURCE azure_storage1
WITH
(
TYPE = HADOOP
, LOCATION ='wasbs://....blob.core.windows.net'
, CREDENTIAL = ASBSecret
)
;
-- Creating external file format (delimited text file)
CREATE EXTERNAL FILE FORMAT text_file_format
WITH
(
FORMAT_TYPE = DELIMITEDTEXT
, FORMAT_OPTIONS (
FIELD_TERMINATOR ='|'
, USE_TYPE_DEFAULT = TRUE
)
)
;
-- Creating external table pointing to file stored in Azure Storage
CREATE EXTERNAL TABLE [Myfile]
(
Col1 varchar(5),
Col2 varchar(5)
)
WITH
(
LOCATION = '/myfile.txt'
, DATA_SOURCE = azure_storage1
, FILE_FORMAT = text_file_format
)
;
[Answer]
We’re currently working on a way to surface the reason for a reject to the user.
In the meantime, here's what's happening:
The default number of rows allowed to fail schema matching is 0. This means that if even one of the rows you’re loading from /myfile.txt doesn’t match the schema, the query is aborted. In Hive, a string can hold an arbitrary number of characters, but a varchar(n) cannot hold more than n. In this case the load fails on the varchar(5) for “null , ZZ” because that value is longer than 5 characters.
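One way to make that row match the schema is to widen the columns. Here is a minimal sketch that reuses the data source and file format from the question; the length of 50 is an assumption for illustration, so pick whatever fits your data:

```sql
-- Drop the existing external table so it can be recreated with wider columns.
DROP EXTERNAL TABLE [Myfile];

-- Recreate it. varchar(50) is an assumed length, not part of the original script;
-- it only needs to be at least as long as the longest value in each column.
CREATE EXTERNAL TABLE [Myfile]
(
    Col1 varchar(50),
    Col2 varchar(50)
)
WITH
(
    LOCATION = '/myfile.txt'
,   DATA_SOURCE = azure_storage1
,   FILE_FORMAT = text_file_format
)
;
```

With columns this wide, the value “null , ZZ” should no longer be rejected for length.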
If you set REJECT_VALUE in the CREATE EXTERNAL TABLE statement, the query will tolerate the failing row and let the other row through. More info can be found here: https://msdn.microsoft.com/library/dn935021(v=sql.130).aspx
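As a rough sketch of what that could look like, here is the same table definition with a reject threshold. The table name [Myfile_rejects] and the threshold of 1 are assumptions for illustration; REJECT_TYPE and REJECT_VALUE are the options described on the page linked above:

```sql
-- Hypothetical table name, chosen so it does not clash with [Myfile].
CREATE EXTERNAL TABLE [Myfile_rejects]
(
    Col1 varchar(5),
    Col2 varchar(5)
)
WITH
(
    LOCATION = '/myfile.txt'
,   DATA_SOURCE = azure_storage1
,   FILE_FORMAT = text_file_format
,   REJECT_TYPE = VALUE   -- count rejected rows as an absolute number
,   REJECT_VALUE = 1      -- assumed threshold: the query fails only if more than 1 row is rejected
)
;

-- Rows that do not fit the schema are skipped instead of aborting the query.
SELECT * FROM [Myfile_rejects];
```

Note that the rejected row is dropped from the result, so this only helps if you are fine with losing rows that don't fit. If you need every row, widening the columns is the safer choice.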
