Using the Apache Commons IO toolkit

IOUtils

IOUtils contains utility methods for reading, writing, and copying. These methods work on InputStream, OutputStream, Reader, and Writer.

As an example, consider the task of reading bytes from a URL and printing them. This would typically be done like this:

@Test
public void ioGeneric() throws IOException {
    InputStream in = new URL( "http://badu.com" ).openStream();
    try {
        InputStreamReader inR = new InputStreamReader( in );
        BufferedReader buf = new BufferedReader( inR );
        String line;
        while ( ( line = buf.readLine() ) != null ) {
            System.out.println( line );
        }
    } finally {
        in.close();
    }
}


With the IOUtils class, that could be done with:

@Test
public void io() throws IOException {
    InputStream in = new URL( "http://badu.com" ).openStream();
    try {
        System.err.println(IOUtils.toString(in));
    } finally {
        IOUtils.closeQuietly(in);
    }
}

In certain application domains, such IO operations are common, and this class can save a great deal of time. And you can rely on well-tested code.

For utility code such as this, flexibility and speed are of primary importance. However, you should also understand the limitations of this approach. Using the above technique to read a 1GB file would result in an attempt to create a 1GB String object!
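One way around the 1GB String problem is to process the stream incrementally instead of materializing it. The following is a minimal sketch using only the JDK rather than Commons IO; the class name StreamingLineFilter, the method countMatchingLines, and the choice of UTF-8 are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons IO.
class StreamingLineFilter {

    // Counts the lines containing `needle` without building one String for the whole input.
    // UTF-8 is an assumption; the caller remains responsible for closing `in`.
    static long countMatchingLines(InputStream in, String needle) {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        // lines() is lazy, so the input is consumed line by line;
        // read failures surface as UncheckedIOException.
        return reader.lines()
                .filter(line -> line.contains(needle))
                .count();
    }
}
```

Memory use then depends on the longest line rather than on the total size of the input.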

FileUtils

The FileUtils class contains utility methods for working with File objects. These include reading, writing, copying and comparing files.

For example, to read an entire file line by line you could use:

File file = new File("/commons/io/project.properties");
List<String> lines = FileUtils.readLines(file, "UTF-8");

FilenameUtils

The FilenameUtils class contains utility methods for working with filenames without using File objects. The class aims to be consistent between Unix and Windows, to aid transitions between these environments (such as moving from development to production).

For example, to normalize a filename by removing double dot segments:

 String filename = "C:/commons/io/../lang/project.xml";
 String normalized = FilenameUtils.normalize(filename);
 // result is "C:/commons/lang/project.xml"
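
For comparison, the JDK's java.nio.file.Path offers a similar double-dot cleanup. The sketch below uses that JDK API rather than FilenameUtils, and the class and method names are hypothetical. Unlike FilenameUtils, Path follows the separator rules of the platform it runs on, so it does not give the cross-platform consistency described above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper built on the JDK, not on Commons IO.
class NormalizeSketch {

    // Joins the segments with the platform separator and collapses "." and ".." segments.
    static Path normalize(String first, String... more) {
        return Paths.get(first, more).normalize();
    }
}
```

Calling NormalizeSketch.normalize("commons", "io", "..", "lang", "project.xml") yields a path equal to Paths.get("commons", "lang", "project.xml").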

Line iterator

The org.apache.commons.io.LineIterator class provides a flexible way of working with a line-based file. An instance can be created directly, or via factory methods on FileUtils or IOUtils. The recommended usage pattern is:

@Test
public void iterator() throws IOException {
    File file = new File("d:/.myeclipse.properties");
    LineIterator it = FileUtils.lineIterator(file, "UTF-8");
    try {
        while (it.hasNext()) {
            String line = it.nextLine();
            // do something with line
            System.err.println(line);
        }
    } finally {
        LineIterator.closeQuietly(it);
    }
}

Best practices

Often, you have to deal with files and filenames. There are many things that can go wrong:

A class works in Unix but doesn't on Windows (or vice versa)
Invalid filenames due to double or missing path separators
UNC filenames (on Windows) don't work with my home-grown filename utility function
etc. etc.

These are good reasons not to work with filenames as Strings. Using java.io.File instead handles many of the above cases nicely. Thus, our best practice recommendation is to use java.io.File instead of String for filenames to avoid platform dependencies.

Version 1.1 of commons-io now includes a dedicated filename handling class, FilenameUtils. This does handle many of these filename issues; however, we still recommend, wherever possible, that you use java.io.File objects.

Let's look at an example.

public static String getExtension(String filename) {
  int index = filename.lastIndexOf('.');
  if (index == -1) {
    return "";
  } else {
    return filename.substring(index + 1);
  }
}

Easy enough? Right, but what happens if someone passes in a full path instead of only a filename? Consider the following, perfectly legal path: "C:\Temp\documentation.new\README". The method as defined above would return "new\README", which is definitely not what you wanted.

Please use java.io.File for filenames instead of Strings. The functionality that the class provides is well tested. In FileUtils you will find other useful utility functions around java.io.File.

Instead of:

 String tmpdir = "/var/tmp";
 String tmpfile = tmpdir + System.getProperty("file.separator") + "test.tmp";
 InputStream in = new java.io.FileInputStream(tmpfile);

write:

 File tmpdir = new File("/var/tmp");
 File tmpfile = new File(tmpdir, "test.tmp");
 InputStream in = new java.io.FileInputStream(tmpfile);
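
Returning to the getExtension pitfall above, the fix is to ignore any dot that appears before the last path separator. The following is a minimal sketch of that idea in plain Java; the class name ExtensionSketch and the method extensionOf are hypothetical, and treating both '/' and '\' as separators on every platform is an assumption (a backslash is a legal filename character on Unix).

```java
// Hypothetical helper illustrating a separator-aware extension lookup.
class ExtensionSketch {

    // Returns the text after the last dot of the final path segment, or "" if there is none.
    // Assumption: both '/' and '\' count as path separators regardless of platform.
    static String extensionOf(String path) {
        int lastSeparator = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int lastDot = path.lastIndexOf('.');
        // A dot located before the last separator belongs to a directory name.
        // This check also covers the no-dot case, where lastDot is -1.
        if (lastDot <= lastSeparator) {
            return "";
        }
        return path.substring(lastDot + 1);
    }
}
```

With this version, "C:\Temp\documentation.new\README" yields an empty string, while "/var/tmp/test.tmp" yields "tmp".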

Buffering streams

IO performance depends a lot on the buffering strategy. Usually, it's quite fast to read packets with the size of 512 or 1024 bytes because these sizes match well with the packet sizes used on hard disks in file systems or file system caches. But as soon as you have to read only a few bytes at a time, and do so many times, performance drops significantly.

Make sure you're properly buffering streams when reading or writing streams, especially when working with files. Just decorate your FileInputStream with a BufferedInputStream:

 InputStream in = new java.io.FileInputStream(myfile);
 try {
   in = new java.io.BufferedInputStream(in);
   
   in.read(.....
 } finally {
   IOUtils.closeQuietly(in);
 }
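
To make the effect of buffering visible, the sketch below counts how many read calls actually reach the underlying stream when a caller reads one byte at a time. It uses only the JDK; the names BufferingSketch, ReadCounter, and underlyingReads are hypothetical, and an in-memory ByteArrayInputStream stands in for a file, so it shows call counts rather than real disk timings.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferingSketch {

    // Hypothetical wrapper that counts every read call reaching the wrapped stream.
    static final class ReadCounter extends FilterInputStream {
        int calls;

        ReadCounter(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `size` bytes one at a time and returns how many calls hit the source.
    static int underlyingReads(int size, boolean buffered) {
        ReadCounter source = new ReadCounter(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = buffered ? new BufferedInputStream(source) : source) {
            while (in.read() != -1) {
                // consume one byte per call, the worst case described above
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }
}
```

With 10,000 bytes, the unbuffered run makes 10,001 calls on the source (one per byte plus the final end-of-stream check), while the buffered run needs only a handful because each call refills a whole buffer.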

Take care not to buffer an already buffered stream. Some components like XML parsers may do their own buffering, so decorating the InputStream you pass to the XML parser does nothing but slow down your code. If you use our CopyUtils or IOUtils, you don't need to additionally buffer the streams you use, as the code in there already buffers the copy process. Always check the Javadocs for information. Another case where buffering is unnecessary is when you write to a ByteArrayOutputStream, since you're writing to memory only.

A translation of the Apache Commons IO user guide and best practices.

http://commons.apache.org/proper/commons-io/bestpractices.html

posted @ 2014-09-16 20:33 烫烫烫烫
