Exam Certified Machine Learning Associate topic 1 question 29 discussion

Actual exam question from Databricks's Certified Machine Learning Associate
Question #: 29
Topic #: 1

A data scientist has been given an incomplete notebook from the data engineering team. The notebook uses a Spark DataFrame spark_df on which the data scientist needs to perform further feature engineering. Unfortunately, the data scientist has not yet learned the PySpark DataFrame API.
Which of the following blocks of code can the data scientist run to be able to use the pandas API on Spark?

  • A. import pyspark.pandas as ps
    df = ps.DataFrame(spark_df)
  • B. import pyspark.pandas as ps
    df = ps.to_pandas(spark_df)
  • C. spark_df.to_sql()
  • D. import pandas as pd
    df = pd.DataFrame(spark_df)
  • E. spark_df.to_pandas()
Suggested Answer: A
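
A minimal sketch of what option A does, contrasted with collecting the data into a plain pandas DataFrame. It assumes PySpark 3.2 or later, where `pyspark.pandas` is available. The helper names `to_pandas_on_spark`, `to_local_pandas`, and `demo` are hypothetical and used only for illustration. Note that the collecting method on a Spark DataFrame is spelled `toPandas()`, while `to_pandas()` is a method of pandas-on-Spark DataFrames.

```python
def to_pandas_on_spark(spark_df):
    """Option A: wrap a Spark DataFrame in a pandas-on-Spark DataFrame."""
    import pyspark.pandas as ps  # assumption: PySpark 3.2+ is installed

    # The data stays distributed; only the API becomes pandas-like.
    return ps.DataFrame(spark_df)


def to_local_pandas(spark_df):
    """For contrast: collect all rows to the driver as a plain pandas DataFrame."""
    # Spark DataFrames use the camelCase name toPandas().
    return spark_df.toPandas()


def demo():
    """Illustrative usage; needs a working local Spark installation."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    spark_df = spark.createDataFrame([(1, 2.0), (2, 3.5)], ["id", "value"])

    psdf = to_pandas_on_spark(spark_df)
    # pandas-style column assignment, executed by Spark underneath
    psdf["value_doubled"] = psdf["value"] * 2
    print(type(psdf))

    pdf = to_local_pandas(spark_df)
    print(type(pdf))

    spark.stop()
```

Calling `demo()` on a machine with Spark installed prints the two resulting types. The first approach keeps the data distributed across the cluster, whereas the second is only suitable when the data fits in driver memory.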

Comments

rajneesharora
3 days, 5 hours ago
A is correct
upvoted 1 times
68c6a4b
2 weeks, 5 days ago
It's not A. The answer is E, spark_df.to_pandas(). Here's why: the to_pandas() method is a built-in method of the PySpark DataFrame API that converts a Spark DataFrame to a pandas DataFrame. By calling spark_df.to_pandas(), the data scientist can convert the Spark DataFrame spark_df to a pandas DataFrame and use the familiar pandas API for further feature engineering. The resulting pandas DataFrame is stored in memory on the driver node, so this approach is suitable only when the data is small enough to fit in the driver's memory.
upvoted 1 times
rajneesharora
3 days, 5 hours ago
E is not correct, as to_pandas() would convert the data into a pandas DataFrame, while what is given is a Spark DataFrame.
upvoted 1 times
Community vote distribution
A (35%)
C (25%)
B (20%)
Other