Question: Korean text comes out garbled when parsing

위드

https://www.androidpub.com/android_dev_qna/551887

2010.07.17 15:53:30


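I wrote the code shown below, and something strange is happening...
When I fetch and load an html file there is no problem at all,
but when I load an xhtml file, all of the Korean text comes out garbled.
The html file is EUC-KR and the xhtml file is EUC-KR too,
yet only the xhtml is garbled.
So I tried setting the encoding on the input myself:
m_searchTxt = URLEncoder.encode(sourceUrlString,"euc-kr");
source = new Source(new URL(m_searchTxt));
But this changes the address itself, so the : and / in the URL get encoded and it fails with an error.
What should I change so that the Korean text is not garbled?
I also thought that even if the input was wrong I might still be able to fix the output, so I tried
fix = new String(list.get(1).getBytes("EUC_KR"));
and printed the returned value, but the result was exactly the same.
This is killing me, please help.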

// Note: list is not declared in this post; it is presumably a field of the enclosing class.
public ArrayList<String> getHtmlToText(String sourceUrlString) {
    try {
        // No charset is given here, so the parser decides the encoding on its own.
        Source source = new Source(new URL(sourceUrlString));
        source.fullSequentialParse();
        // Capture the contents of every <td>...</td> cell.
        Pattern ptn = Pattern.compile("<td.*?>(.+?)</td>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher mch = ptn.matcher(source);
        while (mch.find()) {
            list.add(mch.group(1));
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return list;
}
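
For reference, here is a small sketch of why the two workarounds described above do not help. It uses only the standard library. The class and method names are made up for illustration, and the assumption that the page bytes were first mis-decoded as UTF-8 is mine, not something stated in the post. URLEncoder.encode percent-encodes the whole string, including the ':' and '/' of the URL, and re-encoding a string that was already decoded with the wrong charset cannot bring back the original characters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class EncodingAttempts {
    static final Charset EUC_KR = Charset.forName("EUC-KR");

    // Attempt 1: URLEncoder is meant for query values, so it also escapes ':' and '/'.
    static String encodeWholeUrl(String url) {
        try {
            return URLEncoder.encode(url, "EUC-KR");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("EUC-KR is not available on this JVM", e);
        }
    }

    // Simulates the bug: EUC-KR bytes decoded with the wrong charset
    // (UTF-8 is an assumption made for this sketch).
    static String misdecode(String text) {
        return new String(text.getBytes(EUC_KR), StandardCharsets.UTF_8);
    }

    // Attempt 2: re-encode the already garbled string and decode it again.
    // Invalid bytes were replaced during the first decode, so the data is gone.
    static String attemptRepair(String garbled) {
        return new String(garbled.getBytes(EUC_KR), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints http%3A%2F%2Fexample.com%2Fa.xhtml
        System.out.println(encodeWholeUrl("http://example.com/a.xhtml"));

        String original = "\uD55C\uAE00"; // the two syllables of "Hangul"
        String garbled = misdecode(original);
        System.out.println(garbled.equals(original));                // false
        System.out.println(attemptRepair(garbled).equals(original)); // false
    }
}
```

The encoding has to be chosen at the point where the bytes are first turned into characters. Once that step has gone wrong, later conversions on the String work on damaged data.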


2010.07.18 13:09:17

Darklake

By any chance, what is the text encoding set to in your Eclipse project settings?

2010.07.20 19:31:44

위드

InputStreamReader isr = new InputStreamReader(url.openStream(), "EUC-KR");

Opening the URL as a stream and specifying the encoding on the reader like this solved it~

Thanks for the advice, Darklake~
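
For anyone who finds this thread later, here is a minimal sketch of the fix described above: wrap the stream in an InputStreamReader with an explicit charset and read the text from that reader. The class and method names are made up for illustration, and a ByteArrayInputStream stands in for url.openStream() so the example runs without a network connection. Passing the decoded text on to the HTML parser instead of the URL is my assumption about how the rest of the code was adapted, since the post does not show it.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class EucKrReader {
    static final Charset EUC_KR = Charset.forName("EUC-KR");

    // Reads the whole stream as text, decoding the bytes with the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for url.openStream(): a table cell containing Korean text, stored as EUC-KR bytes.
        byte[] page = "<td>\uD55C\uAE00</td>".getBytes(EUC_KR);

        String right = readAll(new ByteArrayInputStream(page), EUC_KR);
        String wrong = readAll(new ByteArrayInputStream(page), StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("<td>\uD55C\uAE00</td>")); // true
        System.out.println(wrong.equals(right));                   // false
    }
}
```

With a real page, the first argument would be url.openStream(), as in the line above.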