pyspark.sql.functions.xpath#

pyspark.sql.functions.xpath(xml, path)[source]#

Returns a string array of values within the nodes of xml that match the XPath expression.

New in version 3.5.0.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame(
...     [('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>',)], ['x'])
>>> df.select(sf.xpath(df.x, sf.lit('a/b/text()'))).show()
+--------------------+
|xpath(x, a/b/text())|
+--------------------+
|        [b1, b2, b3]|
+--------------------+