Google BigQuery Experience:
Ingestion into a Google Cloud Storage bucket from the data landing pad using gsutil
$gsutil cp file.txt gs://bucket/folder/ #Copy one file to the GCS bucket
$gsutil -m cp file* gs://bucket/folder/ #Copy all matching files in parallel
Load into BigQuery from the GCS bucket
#Load each complete line into a single-column table by using a field delimiter (\x01) that does not occur in the data
$bq load --field_delimiter=$(echo -en "\x01") --skip_leading_rows=1 --max_bad_records=1 \
--noautodetect --source_format=CSV dataset.tableNAME gs://bucket/folder/file.txt line:string
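The single-column load above only works if the \x01 byte never appears in the data. A minimal sketch of a local pre-check, run before the file is uploaded; the helper name has_ctrl_a and the sample files are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the file contains the \x01 (Ctrl-A) byte.
has_ctrl_a() {
  ctrl_a=$(printf '\001')
  LC_ALL=C grep -q "$ctrl_a" "$1"
}

# Demo on throwaway sample files (names and contents are illustrative only).
tmpdir=$(mktemp -d)
printf 'header\nline one, with commas\nline two\n' > "$tmpdir/clean.txt"
printf 'header\nbad\001line\n' > "$tmpdir/dirty.txt"

for f in "$tmpdir/clean.txt" "$tmpdir/dirty.txt"; do
  if has_ctrl_a "$f"; then
    echo "$f: contains the delimiter byte, some rows would split into several columns"
  else
    echo "$f: no delimiter byte found, each line loads as one value"
  fi
done
rm -rf "$tmpdir"
```

If the check finds the byte, another character that is absent from the data can be passed to --field_delimiter instead.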
#Load a file that uses ~ as the field delimiter into a table
$bq load -F "~" --skip_leading_rows=1 --allow_jagged_rows \
--source_format=CSV dataset.tableNM \
gs://bucket/folder/file.txt col1:string,col2:date
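Before relying on --allow_jagged_rows, it can help to know how many rows are actually jagged. The sketch below counts data rows with fewer or more fields than expected in a local copy of the file; the helper name count_jagged and the sample data are hypothetical. As far as I know, --allow_jagged_rows only tolerates missing trailing columns, so rows with extra fields would still count against --max_bad_records.

```shell
#!/bin/sh
# Hypothetical helper: count data rows with fewer or more fields than expected.
# Usage: count_jagged FILE DELIMITER EXPECTED_COLUMNS
count_jagged() {
  awk -F"$2" -v want="$3" '
    NR > 1 {                       # skip the header, like --skip_leading_rows=1
      if (NF < want) fewer++
      else if (NF > want) more++
    }
    END { printf "fewer=%d more=%d\n", fewer, more }
  ' "$1"
}

# Demo: one complete row, one short row, one row with an extra field.
sample=$(mktemp)
printf 'col1~col2\na~2020-01-01\nb\nc~2020-01-02~extra\n' > "$sample"
count_jagged "$sample" "~" 2   # prints: fewer=1 more=1
rm -f "$sample"
```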
Uncompress files in GCP storage
$gsutil cat gs://bucket/file.txt.gz | zcat | gsutil cp - gs://bucket/file.txt
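#Note: -Z is an option of cp, not of gsutil, and it compresses rather than uncompresses: it gzips a local file on upload
$gsutil cp -Z file.txt gs://bucket/file.txt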
Join files into a single file using compose (at most 32 source objects per call)
$gsutil compose gs://bucket/file.* gs://bucket/file.txt
#Unix command to split a file into 10 GiB pieces (the pieces get the suffixes _aa, _ab, ...)
$split -b 10G file.txt file.txt_
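The split and compose steps rely on the pieces being joined back in suffix order. A minimal local sketch of that round trip, with a tiny piece size so it runs quickly; the helper name split_and_check is hypothetical, and cat stands in for gsutil compose here. Note that split -b cuts at byte boundaries, possibly mid-line, which is harmless when the pieces are re-joined but matters if they are loaded separately.

```shell
#!/bin/sh
# Hypothetical helper: split FILE into pieces of SIZE bytes (FILE_aa, FILE_ab, ...)
# and confirm that concatenating the pieces in glob order reproduces FILE.
split_and_check() {
  split -b "$2" "$1" "$1"_
  cat "$1"_* | cmp -s - "$1"
}

# Demo on a throwaway file (contents are illustrative only).
tmpdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > "$tmpdir/file.txt"

if split_and_check "$tmpdir/file.txt" 1000; then
  echo "pieces reassemble to the original"
fi
ls "$tmpdir"
rm -rf "$tmpdir"
```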
Show and update dataset metadata
$bq show --format=prettyjson project:dataset > update.json
#Edit update.json before updating, e.g. change a specialGroup access entry to userByEmail
$bq update --source update.json project:dataset
Change or authorize the gcloud project
$gcloud init
Pick configuration to use:
[1] Re-initialize this configuration [default] with new settings
[2] Create a new configuration
Please enter your numeric choice: 1
Do you have a network proxy you would like to set in gcloud (Y/n)? n
Would you like to continue anyway (y/N)? y
Choose the account you would like to use to perform operations for this configuration
[1] existing email if any
[2] Log in with a new account
Please enter your numeric choice:2
<Authenticate the Account using Google Authentication UI >
You are logged in as: [email_id]
This account has a lot of projects! Listing them all can take a while
[1] Enter a project ID
[2] Create a new project
[3] List projects
Please enter your numeric choice:1
Enter an existing project id you would like to use:<project_id>
Your current project has been set to: [<project_id>]
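The interactive session above can also be approximated with individual gcloud commands, which is easier to script. This is a sketch, not a replacement for gcloud init: the project ID my-project-id is a placeholder, and the run wrapper with its DRY_RUN switch is a hypothetical helper that only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Placeholder project ID; replace with a real one before running for real.
PROJECT_ID="${PROJECT_ID:-my-project-id}"
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run gcloud auth login                        # opens the Google sign-in flow
run gcloud config set project "$PROJECT_ID"  # switch the active project
run gcloud config list                       # show the resulting configuration
```

Running it with DRY_RUN=0 executes the three commands in order.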