
Experience Using Google BigQuery



Ingest files from a data landing pad to a Google Cloud Storage bucket using gsutil

$ gsutil cp file.txt gs://bucket/folder    # copy a single file to a GCS bucket
$ gsutil -m cp file* gs://bucket/folder    # copy all matching files in parallel
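
After a copy, you can sanity-check what landed in the bucket. A minimal check, assuming the same paths as above:

$ gsutil ls -l gs://bucket/folder    # list objects with size and upload time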

Load into BigQuery from a GCS bucket

# Load each complete line into a one-column table by using a field delimiter
# that never occurs in the data
$ bq load --field_delimiter=$(echo -en "\x01") --skip_leading_rows=1 --max_bad_records=1 \
    --noautodetect --source_format=CSV dataset.tableNAME gs://bucket/folder line:string

# Load a file with field delimiter ~ into a table
$ bq load -F "~" --skip_leading_rows=1 --allow_jagged_rows \
    --source_format=CSV dataset.tableNM \
    gs://bucket/folder col1:string,col2:date
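
Once whole lines are loaded into a one-column table, the fields can be split out in SQL. A hypothetical sketch, assuming ~-delimited lines were loaded into the single-column table above:

$ bq query --use_legacy_sql=false \
    'SELECT SPLIT(line, "~")[OFFSET(0)] AS col1,
            SPLIT(line, "~")[OFFSET(1)] AS col2
     FROM dataset.tableNAME'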

Uncompress files in Cloud Storage

# stream the compressed object through zcat and write the decompressed copy back
$ gsutil cat gs://bucket/file.txt.gz | zcat | gsutil cp - gs://bucket/file.txt

# -Z is a cp suboption that gzip-compresses local uploads (Content-Encoding: gzip);
# gsutil then decompresses such objects automatically when they are downloaded
$ gsutil cp -Z file.txt gs://bucket/file.txt
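
To confirm whether an object is stored compressed, inspect its metadata. A minimal check against the object above:

$ gsutil stat gs://bucket/file.txt.gz    # shows Content-Type, Content-Encoding, size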

Join files (up to 32 per operation) into a single file using compose

$ gsutil compose gs://bucket/file.* gs://bucket/file.txt

# Unix command to split a large file (split parts get the suffixes _aa, _ab, ...)
$ split -b 10G file.txt file.txt_
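
compose accepts at most 32 source objects per call, but larger sets can be joined in stages. A hypothetical sketch, assuming 52 split parts with suffixes _aa through _bz:

$ gsutil compose gs://bucket/file.txt_a* gs://bucket/part1    # parts _aa.._az
$ gsutil compose gs://bucket/file.txt_b* gs://bucket/part2    # parts _ba.._bz
$ gsutil compose gs://bucket/part1 gs://bucket/part2 gs://bucket/file.txt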


Show and update dataset metadata

# dump dataset metadata (including the access list) to a JSON file
$ bq show --format=prettyjson project:dataset > update.json

# edit update.json (e.g., replace a specialGroup entry with a userByEmail entry), then apply it
$ bq update --source update.json project:dataset
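
For reference, the access entries in the dumped JSON look roughly like this (a hypothetical excerpt with an example email):

$ cat update.json
{
  ...
  "access": [
    { "role": "READER", "specialGroup": "projectReaders" },
    { "role": "READER", "userByEmail": "analyst@example.com" }
  ]
}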

Change or authorize a project with gcloud

$ gcloud init
Pick configuration to use:
 [1] Re-initialize this configuration [default] with new settings
 [2] Create a new configuration
Please enter your numeric choice: 1
Do you have a network proxy you would like to set in gcloud (Y/n)? n
Would you like to continue anyway (y/N)? y
Choose the account you would like to use to perform operations for this configuration:
 [1] <existing email, if any>
 [2] Log in with a new account
Please enter your numeric choice: 2
<authenticate the account in the Google sign-in UI>
You are logged in as [email_id]
This account has a lot of projects! Listing them all can take a while.
 [1] Enter a project ID
 [2] Create a new project
 [3] List projects
Please enter your numeric choice: 1
Enter an existing project ID you would like to use: <project_id>
Your current project has been set to: [<project_id>]
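
If the account is already authenticated, switching projects does not need the full init dialog. A minimal alternative, assuming the same <project_id>:

$ gcloud auth login                        # only if not yet authenticated
$ gcloud config set project <project_id>   # switch the active project
$ gcloud config list                       # verify active account and project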

