Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQuery on python. upload_from_file job not done #3343

Closed
patitonav opened this issue Apr 28, 2017 · 5 comments
Closed

BigQuery on python. upload_from_file job not done #3343

patitonav opened this issue Apr 28, 2017 · 5 comments
Assignees
Labels
api: bigquery Issues related to the BigQuery API.

Comments

@patitonav
Copy link

patitonav commented Apr 28, 2017

I am running a test to upload a .json file into an bigquery table and i cant get the job done.
Here is my code :

def stream_data2(dataset_name, table_name, json_data):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)

    # Reload the table to get the schema.
    table.reload()
	
    job = table.upload_from_file(
        json_data, source_format='NEWLINE_DELIMITED_JSON', 
        encoding='UTF-8',
        create_disposition='CREATE_IF_NEEDED',
        write_disposition='WRITE_APPEND')
    pprint.pprint(job)


with open('clarin.json', 'rb') as json_data:
    stream_data2(DATASET_NAME, TABLE_NAME, json_data)

The job ended in fail with this error:

> Errors:
> file-00000000: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. (error code: invalid)
> file-00000000: JSON parsing error in row starting at position 0: . No such field: custom. (error code: invalid)
> 

This has no sense becouse i create the table with a json file with the same structure.

Any suggestion?

Thanks

@dhermes dhermes added the api: bigquery Issues related to the BigQuery API. label Apr 28, 2017
@dhermes
Copy link
Contributor

dhermes commented Apr 28, 2017

@tswast Is JSON even supported as an upload type? (source_format='NEWLINE_DELIMITED_JSON' seems to indicate yes)

@tswast
Copy link
Contributor

tswast commented Apr 28, 2017

Yes, JSON is supported. Keep in mind that it's "Newline Delimited". That means each row not table is a separate JSON object, each separated by a newline.

@patitonav
Copy link
Author

patitonav commented Apr 28, 2017 via email

@tswast
Copy link
Contributor

tswast commented Apr 28, 2017

I recommend using the bq command line tool to experiment with the file format & schema. Once you have a file that BigQuery accepts, then you can automate it with this library. https://cloud.google.com/bigquery/bq-command-line-tool

@dhermes
Copy link
Contributor

dhermes commented Apr 28, 2017

@patitonav I am going to pre-emptively close this, since it seems your issue is with your data formatting and this isn't really an issue with this library.

Let me know if you'd like to re-open this issue.

Thanks for the assist @tswast, helpful as always!

@dhermes dhermes closed this as completed Apr 28, 2017
parthea pushed a commit that referenced this issue Jun 19, 2024
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.0.7 to 2.2.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/releases">urllib3's
releases</a>.</em></p>
<blockquote>
<h2>2.2.2</h2>
<h2>🚀 urllib3 is fundraising for HTTP/2 support</h2>
<p><a
href="https://sethmlarson.dev/urllib3-is-fundraising-for-http2-support">urllib3
is raising ~$40,000 USD</a> to release HTTP/2 support and ensure
long-term sustainable maintenance of the project after a sharp decline
in financial support for 2023. If your company or organization uses
Python and would benefit from HTTP/2 support in Requests, pip, cloud
SDKs, and thousands of other projects <a
href="https://opencollective.com/urllib3">please consider contributing
financially</a> to ensure HTTP/2 support is developed sustainably and
maintained for the long-haul.</p>
<p>Thank you for your support.</p>
<h2>Changes</h2>
<ul>
<li>Added the <code>Proxy-Authorization</code> header to the list of
headers to strip from requests when redirecting to a different host. As
before, different headers can be set via
<code>Retry.remove_headers_on_redirect</code>.</li>
<li>Allowed passing negative integers as <code>amt</code> to read
methods of <code>http.client.HTTPResponse</code> as an alternative to
<code>None</code>. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3122">#3122</a>)</li>
<li>Fixed return types representing copying actions to use
<code>typing.Self</code>. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3363">#3363</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/urllib3/urllib3/compare/2.2.1...2.2.2">https://github.com/urllib3/urllib3/compare/2.2.1...2.2.2</a></p>
<h2>2.2.1</h2>
<h2>🚀 urllib3 is fundraising for HTTP/2 support</h2>
<p><a
href="https://sethmlarson.dev/urllib3-is-fundraising-for-http2-support">urllib3
is raising ~$40,000 USD</a> to release HTTP/2 support and ensure
long-term sustainable maintenance of the project after a sharp decline
in financial support for 2023. If your company or organization uses
Python and would benefit from HTTP/2 support in Requests, pip, cloud
SDKs, and thousands of other projects <a
href="https://opencollective.com/urllib3">please consider contributing
financially</a> to ensure HTTP/2 support is developed sustainably and
maintained for the long-haul.</p>
<p>Thank you for your support.</p>
<h2>Changes</h2>
<ul>
<li>Fixed issue where <code>InsecureRequestWarning</code> was emitted
for HTTPS connections when using Emscripten. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3331">#3331</a>)</li>
<li>Fixed <code>HTTPConnectionPool.urlopen</code> to stop automatically
casting non-proxy headers to <code>HTTPHeaderDict</code>. This change
was premature as it did not apply to proxy headers and
<code>HTTPHeaderDict</code> does not handle byte header values correctly
yet. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3343">#3343</a>)</li>
<li>Changed <code>ProtocolError</code> to
<code>InvalidChunkLength</code> when response terminates before the
chunk length is sent. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/2860">#2860</a>)</li>
<li>Changed <code>ProtocolError</code> to be more verbose on incomplete
reads with excess content. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3261">#3261</a>)</li>
</ul>
<h2>2.2.0</h2>
<h2>🖥️ urllib3 now works in the browser</h2>
<p>:tada: <strong>This release adds experimental support for <a
href="https://urllib3.readthedocs.io/en/stable/reference/contrib/emscripten.html">using
urllib3 in the browser with Pyodide</a>!</strong> 🎉</p>
<p>Thanks to Joe Marshall (<a
href="https://github.com/joemarshall"><code>@​joemarshall</code></a>)
for contributing this feature. This change was possible thanks to work
done in urllib3 v2.0 to detach our API from <code>http.client</code>.
Please report all bugs to the <a
href="https://github.com/urllib3/urllib3/issues">urllib3 issue
tracker</a>.</p>
<h2>🚀 urllib3 is fundraising for HTTP/2 support</h2>
<p><a
href="https://sethmlarson.dev/urllib3-is-fundraising-for-http2-support">urllib3
is raising ~$40,000 USD</a> to release HTTP/2 support and ensure
long-term sustainable maintenance of the project after a sharp decline
in financial support for 2023. If your company or organization uses
Python and would benefit from HTTP/2 support in Requests, pip, cloud
SDKs, and thousands of other projects <a
href="https://opencollective.com/urllib3">please consider contributing
financially</a> to ensure HTTP/2 support is developed sustainably and
maintained for the long-haul.</p>
<p>Thank you for your support.</p>
<h2>Changes</h2>
<ul>
<li>Added support for <a
href="https://urllib3.readthedocs.io/en/latest/reference/contrib/emscripten.html">Emscripten
and Pyodide</a>, including streaming support in cross-origin isolated
browser environments where threading is enabled. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/2951">#2951</a>)</li>
<li>Added support for <code>HTTPResponse.read1()</code> method. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3186">#3186</a>)</li>
<li>Added rudimentary support for HTTP/2. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3284">#3284</a>)</li>
<li>Fixed issue where requests against urls with trailing dots were
failing due to SSL errors
when using proxy. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/2244">#2244</a>)</li>
<li>Fixed <code>HTTPConnection.proxy_is_verified</code> and
<code>HTTPSConnection.proxy_is_verified</code> to be always set to a
boolean after connecting to a proxy. It could be <code>None</code> in
some cases previously. (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3130">#3130</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/blob/main/CHANGES.rst">urllib3's
changelog</a>.</em></p>
<blockquote>
<h1>2.2.2 (2024-06-17)</h1>
<ul>
<li>Added the <code>Proxy-Authorization</code> header to the list of
headers to strip from requests when redirecting to a different host. As
before, different headers can be set via
<code>Retry.remove_headers_on_redirect</code>.</li>
<li>Allowed passing negative integers as <code>amt</code> to read
methods of <code>http.client.HTTPResponse</code> as an alternative to
<code>None</code>.
(<code>[#3122](urllib3/urllib3#3122)
&lt;https://github.com/urllib3/urllib3/issues/3122&gt;</code>__)</li>
<li>Fixed return types representing copying actions to use
<code>typing.Self</code>.
(<code>[#3363](urllib3/urllib3#3363)
&lt;https://github.com/urllib3/urllib3/issues/3363&gt;</code>__)</li>
</ul>
<h1>2.2.1 (2024-02-16)</h1>
<ul>
<li>Fixed issue where <code>InsecureRequestWarning</code> was emitted
for HTTPS connections when using Emscripten.
(<code>[#3331](urllib3/urllib3#3331)
&lt;https://github.com/urllib3/urllib3/issues/3331&gt;</code>__)</li>
<li>Fixed <code>HTTPConnectionPool.urlopen</code> to stop automatically
casting non-proxy headers to <code>HTTPHeaderDict</code>. This change
was premature as it did not apply to proxy headers and
<code>HTTPHeaderDict</code> does not handle byte header values correctly
yet. (<code>[#3343](urllib3/urllib3#3343)
&lt;https://github.com/urllib3/urllib3/issues/3343&gt;</code>__)</li>
<li>Changed <code>InvalidChunkLength</code> to
<code>ProtocolError</code> when response terminates before the chunk
length is sent.
(<code>[#2860](urllib3/urllib3#2860)
&lt;https://github.com/urllib3/urllib3/issues/2860&gt;</code>__)</li>
<li>Changed <code>ProtocolError</code> to be more verbose on incomplete
reads with excess content.
(<code>[#3261](urllib3/urllib3#3261)
&lt;https://github.com/urllib3/urllib3/issues/3261&gt;</code>__)</li>
</ul>
<h1>2.2.0 (2024-01-30)</h1>
<ul>
<li>Added support for <code>Emscripten and Pyodide
&lt;https://urllib3.readthedocs.io/en/latest/reference/contrib/emscripten.html&gt;</code><strong>,
including streaming support in cross-origin isolated browser
environments where threading is enabled.
(<code>[#2951](urllib3/urllib3#2951)
&lt;https://github.com/urllib3/urllib3/issues/2951&gt;</code></strong>)</li>
<li>Added support for <code>HTTPResponse.read1()</code> method.
(<code>[#3186](urllib3/urllib3#3186)
&lt;https://github.com/urllib3/urllib3/issues/3186&gt;</code>__)</li>
<li>Added rudimentary support for HTTP/2.
(<code>[#3284](urllib3/urllib3#3284)
&lt;https://github.com/urllib3/urllib3/issues/3284&gt;</code>__)</li>
<li>Fixed issue where requests against urls with trailing dots were
failing due to SSL errors
when using proxy.
(<code>[#2244](urllib3/urllib3#2244)
&lt;https://github.com/urllib3/urllib3/issues/2244&gt;</code>__)</li>
<li>Fixed <code>HTTPConnection.proxy_is_verified</code> and
<code>HTTPSConnection.proxy_is_verified</code>
to be always set to a boolean after connecting to a proxy. It could be
<code>None</code> in some cases previously.
(<code>[#3130](urllib3/urllib3#3130)
&lt;https://github.com/urllib3/urllib3/issues/3130&gt;</code>__)</li>
<li>Fixed an issue where <code>headers</code> passed in a request with
<code>json=</code> would be mutated
(<code>[#3203](urllib3/urllib3#3203)
&lt;https://github.com/urllib3/urllib3/issues/3203&gt;</code>__)</li>
<li>Fixed <code>HTTPSConnection.is_verified</code> to be set to
<code>False</code> when connecting
from a HTTPS proxy to an HTTP target. It was set to <code>True</code>
previously.
(<code>[#3267](urllib3/urllib3#3267)
&lt;https://github.com/urllib3/urllib3/issues/3267&gt;</code>__)</li>
<li>Fixed handling of new error message from OpenSSL 3.2.0 when
configuring an HTTP proxy as HTTPS
(<code>[#3268](urllib3/urllib3#3268)
&lt;https://github.com/urllib3/urllib3/issues/3268&gt;</code>__)</li>
<li>Fixed TLS 1.3 post-handshake auth when the server certificate
validation is disabled
(<code>[#3325](urllib3/urllib3#3325)
&lt;https://github.com/urllib3/urllib3/issues/3325&gt;</code>__)</li>
<li>Note for downstream distributors: To run integration tests, you now
need to run the tests a second
time with the <code>--integration</code> pytest flag.
(<code>[#3181](urllib3/urllib3#3181)
&lt;https://github.com/urllib3/urllib3/issues/3181&gt;</code>__)</li>
</ul>
<h1>2.1.0 (2023-11-13)</h1>
<ul>
<li>Removed support for the deprecated urllib3[secure] extra.
(<code>[#2680](urllib3/urllib3#2680)
&lt;https://github.com/urllib3/urllib3/issues/2680&gt;</code>__)</li>
<li>Removed support for the deprecated SecureTransport TLS
implementation.
(<code>[#2681](urllib3/urllib3#2681)
&lt;https://github.com/urllib3/urllib3/issues/2681&gt;</code>__)</li>
<li>Removed support for the end-of-life Python 3.7.
(<code>[#3143](urllib3/urllib3#3143)
&lt;https://github.com/urllib3/urllib3/issues/3143&gt;</code>__)</li>
<li>Allowed loading CA certificates from memory for proxies.
(<code>[#3065](urllib3/urllib3#3065)
&lt;https://github.com/urllib3/urllib3/issues/3065&gt;</code>__)</li>
<li>Fixed decoding Gzip-encoded responses which specified
<code>x-gzip</code> content-encoding.
(<code>[#3174](urllib3/urllib3#3174)
&lt;https://github.com/urllib3/urllib3/issues/3174&gt;</code>__)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/urllib3/urllib3/commit/27e2a5c5a7ab6a517252cc8dcef3ffa6ffb8f61a"><code>27e2a5c</code></a>
Release 2.2.2 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3406">#3406</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/accff72ecc2f6cf5a76d9570198a93ac7c90270e"><code>accff72</code></a>
Merge pull request from GHSA-34jh-p97f-mpxf</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/34be4a57e59eb7365bcc37d52e9f8271b5b8d0d3"><code>34be4a5</code></a>
Pin CFFI to a new release candidate instead of a Git commit (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3398">#3398</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/da410581b6b3df73da976b5ce5eb20a4bd030437"><code>da41058</code></a>
Bump browser-actions/setup-chrome from 1.6.0 to 1.7.1 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3399">#3399</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/b07a669bd970d69847801148286b726f0570b625"><code>b07a669</code></a>
Bump github/codeql-action from 2.13.4 to 3.25.6 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3396">#3396</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/b8589ec9f8c4da91511e601b632ac06af7e7c10e"><code>b8589ec</code></a>
Measure coverage with v4 of artifact actions (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3394">#3394</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/f3bdc5585111429e22c81b5fb26c3ec164d98b81"><code>f3bdc55</code></a>
Allow triggering CI manually (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3391">#3391</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/52392654b30183129cf3ec06010306f517d9c146"><code>5239265</code></a>
Fix HTTP version in debug log (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3316">#3316</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/b34619f94ece0c40e691a5aaf1304953d88089de"><code>b34619f</code></a>
Bump actions/checkout to 4.1.4 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3387">#3387</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/9961d14de7c920091d42d42ed76d5d479b80064d"><code>9961d14</code></a>
Bump browser-actions/setup-chrome from 1.5.0 to 1.6.0 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3386">#3386</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/urllib3/urllib3/compare/2.0.7...2.2.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=urllib3&package-manager=pip&previous-version=2.0.7&new-version=2.2.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/googleapis/google-cloud-python/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the BigQuery API.
Projects
None yet
Development

No branches or pull requests

3 participants