Questions tagged [databricks]

For questions about the Databricks Unified Analytics Platform

-1
votes
0 answers
13 views

Interactive pivot grid with Spark/Databricks. Suggest technologies/approach? [closed]

My goal is to develop an application that displays a large amount of data in an interactive pivot grid (web or desktop), with the data coming from Databricks. I'm looking for advice or ...
-1
votes
0 answers
8 views

Databricks debug mode to display DFs [closed]

When we run a Databricks notebook, we remove all the display commands to improve performance. But when we want to debug, we need the display commands to verify the data. How can we ...
1
vote
1 answer
15 views

Parsing Event Hub Complex Array Type messages using spark streaming

I need to parse the Array Type in the body while reading from Event Hub. We have a nested JSON message but are not able to parse it: {"Name": "Rohit","Salary": "29292...
0
votes
0 answers
16 views

Error - java.lang.NoClassDefFoundError: com/microsoft/sqlserver/jdbc/ISQLServerBulkData

I'm getting the error below when connecting to an Azure SQL Server database using the SQL Spark connector from a Databricks notebook. Error: java.lang.NoClassDefFoundError: com/microsoft/sqlserver/jdbc/...
-1
votes
0 answers
17 views

Databricks SQL Group By [closed]

I am trying to find the highest home run totals for the month of October, grouped by player name, in Databricks SQL. I have tried using the following code, but I have had no luck running it ...
0
votes
1 answer
22 views

Connection with client not establishing with socket.accept()

I'm doing a project for class where I stream data from Twitter using Databricks. When it reaches s.accept(), it seems to get stuck there, running indefinitely. Code: def sendTweets(c_socket): auth ...
0
votes
1 answer
39 views

Adding filename to PySpark RDD [closed]

I'm trying to add the filename to a pyspark.sql.dataframe.DataFrame when I import JSON files from the local DBFS inside a Databricks notebook. It is turning out to be more difficult than I thought....
1
vote
1 answer
21 views

How to prevent spark parquet scan on every query

I created a Delta Lake table in Databricks using a SQL command like the following: CREATE TABLE mytable USING DELTA LOCATION '/mnt/s3-mount-point/mytable/' AS SELECT A, B, C FROM t1 I then ...
0
votes
1 answer
21 views

How to use a spark dataframe as a table in a SQL statement

I have a Spark DataFrame in Python. How do I use it in a Spark SQL statement? For example: df = spark.createDataFrame(data = array_of_table_and_time_tuples , schema = ...
-1
votes
1 answer
24 views

How to access secrets in databricks initscript

I have tried to access the secret {{secrets/secrectScope/Key}} in the Advanced tab of a Databricks cluster, and it works fine. But when I try to use the same in a Databricks init script, it is not working ...
0
votes
0 answers
35 views

Heap space error and Connection timeout issue in Spark on Databricks

I am running a Spark job on Azure Databricks (Spark 3.0.1 and Scala 2.12). I have 3 worker nodes with 20 cores and 140 GB memory each and a driver node with 3 cores and 32 GB memory. I am using the following ...
0
votes
0 answers
33 views

Databricks delta table truncating column data containing '-'

I am using a Delta table to load data from my DataFrame. I am observing that column values containing a '-' are getting truncated. I tried to check the records in the DataFrame that I am ...
0
votes
1 answer
25 views

How to access AWS public dataset using Databricks?

For one of my classes, I have to analyze a "big data" dataset. I found the following dataset on the AWS Registry of Open Data that seems interesting: https://registry.opendata.aws/openaq/ ...
1
vote
2 answers
33 views

How can I connect JMeter with a Databricks Spark cluster

I want to connect JMeter to Databricks (a Spark cluster) using the JDBC connection associated with that cluster. I need to perform a concurrency test using JMeter's JDBC request on an Apache Spark ...
0
votes
1 answer
47 views

Is there a way to join two datasets on timestamp with an offset such that it connects time_1 with time_2 where time_2 is 2hrs earlier than time_1?

I'm trying to predict delays based on the weather 2 hours before scheduled travel. I have one dataset of travel data (call it df1) and one dataset of weather (call it df2). In order to predict the delay, I am ...
