Detecting the Encoding of a Text File


Project: a Spring Boot project
Add the following dependency to pom.xml:
<!-- https://mvnrepository.com/artifact/com.googlecode.juniversalchardet/juniversalchardet -->
<dependency>
    <groupId>com.googlecode.juniversalchardet</groupId>
    <artifactId>juniversalchardet</artifactId>
    <version>1.0.3</version>
</dependency>
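import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.mozilla.universalchardet.UniversalDetector;

/**
 * Detects the encoding of a text file and maps the result to "UTF-8" or "GBK".
 * Anything not detected as UTF-8, including "no result", is treated as GBK.
 */
public static String detector(String fileName) {
    String encode = null;
    // try-with-resources closes the stream even if reading fails.
    try (InputStream bis = new BufferedInputStream(new FileInputStream(fileName))) {
        byte[] buffer = new byte[8 * 4096];
        UniversalDetector detector = new UniversalDetector(null);
        int readSize;
        // Feed the detector until end of file or until it has made a decision.
        while ((readSize = bis.read(buffer)) > 0 && !detector.isDone()) {
            detector.handleData(buffer, 0, readSize);
        }
        detector.dataEnd();
        // getDetectedCharset() returns null when no encoding could be determined.
        encode = detector.getDetectedCharset();
        detector.reset();
    } catch (IOException e) {
        // Report the failure instead of swallowing it; encode stays null.
        e.printStackTrace();
    }
    // Return encode directly instead if the raw detected charset name is needed.
    // Comparing from the literal avoids a NullPointerException when encode is null.
    return "UTF-8".equalsIgnoreCase(encode) ? "UTF-8" : "GBK";
}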
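
The method above reduces the detector's answer to a choice between UTF-8 and GBK. When only that two-way decision is needed, a different technique can serve as a cross-check: try to decode the bytes strictly as UTF-8 with the JDK's CharsetDecoder and fall back to GBK if the decode fails. The sketch below is not part of juniversalchardet. The class name Utf8OrGbk and its methods are hypothetical names chosen for illustration, and it assumes the whole file content is passed in at once.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of juniversalchardet.
final class Utf8OrGbk {
    private Utf8OrGbk() {}

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] data) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Assumption: the input is either UTF-8 or GBK, so invalid UTF-8 means GBK. */
    static String guess(byte[] data) {
        return isValidUtf8(data) ? "UTF-8" : "GBK";
    }
}
```

A call such as Utf8OrGbk.guess(Files.readAllBytes(Paths.get(fileName))) would give the same kind of result as detector(fileName). Note that pure ASCII content is valid UTF-8 and is therefore reported as "UTF-8", and that a buffer cut off in the middle of a multi-byte character is reported as invalid, which is why the sketch expects the complete file content.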
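Copyright notice
Author: SQBER
The copyright of this article belongs to the author. Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be given in a prominent position on the page; otherwise the author reserves the right to pursue legal liability.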