The data as shown in Notepad++ (ANSI encoding):
00:00:002982199073774412[360安全卫士]8 3
When I read it with textFile in Eclipse, the text comes out garbled.
Code:
JavaRDD<String> lines = sc.textFile("D:/Scala_Eclipse/SogouQ.reduced", 1);
JavaRDD<String> a = lines.map(new Function<String, String>() {
    @Override
    public String call(String arg0) throws Exception {
        String dd = getEncoding(arg0);
        String h = new String(arg0.getBytes("UTF-8"), "GBK");
        return h;
    }
});
Result:
arg0 = 360��ȫ��
h = 360锟斤拷全锟斤拷士
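
The result above is consistent with the file's GBK bytes having already been decoded as UTF-8 by the time `call` sees `arg0`. Every byte sequence that is not valid UTF-8 has been replaced by U+FFFD, so the original bytes are gone, and `new String(arg0.getBytes("UTF-8"), "GBK")` only re-reads the replacement characters as GBK (hence 锟斤拷). Below is a minimal plain-Java sketch of that effect, with no Spark involved. The class and method names (`GbkDecodeSketch`, `decodeAsUtf8`, `attemptedRepair`, `decodeAsGbk`) are made up for illustration, and it assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GbkDecodeSketch {
    // Assumption: the running JDK supports the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // Mimics the lossy step: raw GBK bytes interpreted as UTF-8.
    // Invalid sequences become U+FFFD, so information is lost here.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // The repair attempted in the question: re-encode as UTF-8, decode as GBK.
    static String attemptedRepair(String decoded) {
        return new String(decoded.getBytes(StandardCharsets.UTF_8), GBK);
    }

    // Decodes the first `length` raw bytes directly as GBK, before any UTF-8 step.
    static String decodeAsGbk(byte[] raw, int length) {
        return new String(raw, 0, length, GBK);
    }

    public static void main(String[] args) {
        String original = "360\u5B89\u5168\u536B\u58EB"; // "360安全卫士"
        byte[] raw = original.getBytes(GBK);             // bytes as stored in the ANSI (GBK) file

        String decoded = decodeAsUtf8(raw);
        // false: the replacement characters cannot be turned back into the original bytes
        System.out.println(attemptedRepair(decoded).equals(original));
        // true: decoding the untouched bytes as GBK recovers the text
        System.out.println(decodeAsGbk(raw, raw.length).equals(original));
    }
}
```

In Spark, one way to get at the undecoded bytes is to read the file with `hadoopFile` and `TextInputFormat`, then decode each `Text` value as `new String(text.getBytes(), 0, text.getLength(), "GBK")`. The length argument matters because `Text.getBytes()` returns a backing array that may be longer than the line. This has not been tested against the file in the question.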