Let me know if this works for you.
input.txt
1999-01-01 12:08:56
1999-01-02 12:08:57
1999-01-03 12:08:58
1999-01-04 12:08:59
PigScript:
A = LOAD 'input.txt' using PigStorage(' ') as(date:chararray,time:chararray);
B = FOREACH A GENERATE CONCAT(date,'T',time) as myDateString;
C = FOREACH B GENERATE ToDate(myDateString);
dump C;
Output:
(1999-01-01T12:08:56.000+05:30)
(1999-01-02T12:08:57.000+05:30)
(1999-01-03T12:08:58.000+05:30)
(1999-01-04T12:08:59.000+05:30)
Now myDateString is a datetime object, so you can process this data using all the built-in date functions.
If you want to store the output in this format
(1999-01-01T12:08:56)
(1999-01-02T12:08:57)
(1999-01-03T12:08:58)
(1999-01-04T12:08:59)
you can use REGEX_EXTRACT to keep everything before the "." in each value, something like this:
D = FOREACH C GENERATE ToString($0) as temp;
E = FOREACH D GENERATE REGEX_EXTRACT(temp, '(.*)\\.(.*)', 1);
dump E;
Output:
(1999-01-01T12:08:56)
(1999-01-02T12:08:57)
(1999-01-03T12:08:58)
(1999-01-04T12:08:59)
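I know nothing about Apache Pig. But if your question is simply how to take a string representing a date and a string representing a time of day and get a date-time, you can combine the two strings into a single string, parse that as a date-time value (object), possibly adjust that value to another time zone (such as UTC), and then serialize that value to a string in a different format to represent the combined date-time value... there are at least a thousand questions and answers on that in StackOverflow. I just gave you the keywords you need to search, along with 'joda' and 'java.time'. – 2014-09-23 18:38:01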
Possible duplicate of [Parse Date String to Some Java Object](http://stackoverflow.com/questions/8854780/parse-date-string-to-some-java-object) – 2014-09-23 18:41:55
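For reference, the steps described in the first comment (combine the two strings, parse, optionally shift to UTC, serialize) might look roughly like this in plain java.time. This is only a sketch and not part of the Pig answer: the class name CombineDateTime and the helpers combine and toUtc are made up for illustration, and the +05:30 offset is an assumption taken from the output shown in the answer.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper class, for illustration only.
class CombineDateTime {

    // Parse an ISO date ("1999-01-01") and an ISO time ("12:08:56")
    // and join them into one local date-time.
    public static LocalDateTime combine(String date, String time) {
        return LocalDateTime.of(LocalDate.parse(date), LocalTime.parse(time));
    }

    // Interpret the local date-time in the given zone, then express
    // the same instant in UTC.
    public static ZonedDateTime toUtc(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).withZoneSameInstant(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        DateTimeFormatter out = DateTimeFormatter.ISO_LOCAL_DATE_TIME;

        LocalDateTime local = combine("1999-01-01", "12:08:56");
        System.out.println(out.format(local)); // prints 1999-01-01T12:08:56

        // Assumed source offset (+05:30), as seen in the Pig output above.
        ZonedDateTime utc = toUtc(local, ZoneOffset.ofHoursMinutes(5, 30));
        System.out.println(out.format(utc));   // prints 1999-01-01T06:38:56
    }
}
```

Formatting with ISO_LOCAL_DATE_TIME drops the offset and the zero fraction of a second, which gives the same shape as the REGEX_EXTRACT output in the answer without any string surgery.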