/*
 * Copyright (C) 2014 The Android Open Source Project
 * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

// -- This file was mechanically generated: Do not edit! -- //

package java.nio.charset;

import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.BufferOverflowException;
import java.nio.BufferUnderflowException;
import java.lang.ref.WeakReference;
import java.nio.charset.CoderMalfunctionError; // javadoc
import java.util.Arrays;


/**
 * An engine that can transform a sequence of sixteen-bit Unicode characters into a sequence of
 * bytes in a specific charset.
 *
 * <a name="steps"></a>
 *
 * <p> The input character sequence is provided in a character buffer or a series
 * of such buffers. The output byte sequence is written to a byte buffer
 * or a series of such buffers. An encoder should always be used by making
 * the following sequence of method invocations, hereinafter referred to as an
 * <i>encoding operation</i>:
 *
 * <ol>
 *
 *   <li><p> Reset the encoder via the {@link #reset reset} method, unless it
 *   has not been used before; </p></li>
 *
 *   <li><p> Invoke the {@link #encode encode} method zero or more times, as
 *   long as additional input may be available, passing <tt>false</tt> for the
 *   <tt>endOfInput</tt> argument and filling the input buffer and flushing the
 *   output buffer between invocations; </p></li>
 *
 *   <li><p> Invoke the {@link #encode encode} method one final time, passing
 *   <tt>true</tt> for the <tt>endOfInput</tt> argument; and then </p></li>
 *
 *   <li><p> Invoke the {@link #flush flush} method so that the encoder can
 *   flush any internal state to the output buffer. </p></li>
 *
 * </ol>
 *
 * Each invocation of the {@link #encode encode} method will encode as many
 * characters as possible from the input buffer, writing the resulting bytes
 * to the output buffer. The {@link #encode encode} method returns when more
 * input is required, when there is not enough room in the output buffer, or
 * when an encoding error has occurred. In each case a {@link CoderResult}
 * object is returned to describe the reason for termination. An invoker can
 * examine this object and fill the input buffer, flush the output buffer, or
 * attempt to recover from an encoding error, as appropriate, and try again.
 *
 * <a name="ce"></a>
 *
 * <p> There are two general types of encoding errors. If the input character
 * sequence is not a legal sixteen-bit Unicode sequence then the input is considered <i>malformed</i>.
 * If the input character sequence is legal but cannot be mapped to a valid
 * byte sequence in the given charset then an <i>unmappable character</i> has been encountered.
 *
 * <a name="cae"></a>
 *
 * <p> How an encoding error is handled depends upon the action requested for
 * that type of error, which is described by an instance of the {@link
 * CodingErrorAction} class. The possible error actions are to {@linkplain
 * CodingErrorAction#IGNORE ignore} the erroneous input, {@linkplain
 * CodingErrorAction#REPORT report} the error to the invoker via
 * the returned {@link CoderResult} object, or {@linkplain CodingErrorAction#REPLACE
 * replace} the erroneous input with the current value of the
 * replacement byte array. The replacement
 * is initially set to the encoder's default replacement, which often
 * (but not always) has the initial value <tt>{</tt> <tt>(byte)'?'</tt> <tt>}</tt>;
 * its value may be changed via the {@link #replaceWith(byte[])
 * replaceWith} method.
 *
 * <p> The default action for malformed-input and unmappable-character errors
 * is to {@linkplain CodingErrorAction#REPORT report} them. The
 * malformed-input error action may be changed via the {@link
 * #onMalformedInput(CodingErrorAction) onMalformedInput} method; the
 * unmappable-character action may be changed via the {@link
 * #onUnmappableCharacter(CodingErrorAction) onUnmappableCharacter} method.
 *
 * <p> This class is designed to handle many of the details of the encoding
 * process, including the implementation of error actions. An encoder for a
 * specific charset, which is a concrete subclass of this class, need only
 * implement the abstract {@link #encodeLoop encodeLoop} method, which
 * encapsulates the basic encoding loop.
 * A subclass that maintains internal state should, additionally, override the
 * {@link #implFlush implFlush} and {@link #implReset implReset} methods.
 *
 * <p> Instances of this class are not safe for use by multiple concurrent
 * threads. </p>
 *
 *
 * @author Mark Reinhold
 * @author JSR-51 Expert Group
 * @since 1.4
 *
 * @see ByteBuffer
 * @see CharBuffer
 * @see Charset
 * @see CharsetDecoder
 */

public abstract class CharsetEncoder {

    private final Charset charset;
    private final float averageBytesPerChar;
    private final float maxBytesPerChar;

    private byte[] replacement;
    private CodingErrorAction malformedInputAction
        = CodingErrorAction.REPORT;
    private CodingErrorAction unmappableCharacterAction
        = CodingErrorAction.REPORT;

    // Internal states
    //
    private static final int ST_RESET   = 0;
    private static final int ST_CODING  = 1;
    private static final int ST_END     = 2;
    private static final int ST_FLUSHED = 3;

    private int state = ST_RESET;

    private static String stateNames[]
        = { "RESET", "CODING", "CODING_END", "FLUSHED" };


    /**
     * Initializes a new encoder. The new encoder will have the given
     * bytes-per-char and replacement values.
     *
     * @param  cs
     *         The charset that created this encoder
     *
     * @param  averageBytesPerChar
     *         A positive float value indicating the expected number of
     *         bytes that will be produced for each input character
     *
     * @param  maxBytesPerChar
     *         A positive float value indicating the maximum number of
     *         bytes that will be produced for each input character
     *
     * @param  replacement
     *         The initial replacement; must not be <tt>null</tt>, must have
     *         non-zero length, must not be longer than maxBytesPerChar,
     *         and must be {@linkplain #isLegalReplacement legal}
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameters do not hold
     */
    protected CharsetEncoder(Charset cs,
                             float averageBytesPerChar,
                             float maxBytesPerChar,
                             byte[] replacement)
    {
        // BEGIN Android-added: A hidden constructor for the CharsetEncoderICU subclass.
        this(cs, averageBytesPerChar, maxBytesPerChar, replacement, false);
    }

    /**
     * This constructor is for subclasses to specify whether {@code replacement} can be used as it
     * is ("trusted"). If it is trusted, {@link #replaceWith(byte[])} and
     * {@link #implReplaceWith(byte[])} will not be called.
     * @hide
     */
    protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement,
            boolean trusted)
    {
        // END Android-added: A hidden constructor for the CharsetEncoderICU subclass.
        this.charset = cs;
        if (averageBytesPerChar <= 0.0f)
            throw new IllegalArgumentException("Non-positive "
                                               + "averageBytesPerChar");
        if (maxBytesPerChar <= 0.0f)
            throw new IllegalArgumentException("Non-positive "
                                               + "maxBytesPerChar");
        if (!Charset.atBugLevel("1.4")) {
            if (averageBytesPerChar > maxBytesPerChar)
                throw new IllegalArgumentException("averageBytesPerChar"
                                                   + " exceeds "
                                                   + "maxBytesPerChar");
        }
        this.replacement = replacement;
        this.averageBytesPerChar = averageBytesPerChar;
        this.maxBytesPerChar = maxBytesPerChar;
        // BEGIN Android-changed: Avoid calling replaceWith() for trusted subclasses.
        // replaceWith(replacement);
        if (!trusted) {
            replaceWith(replacement);
        }
        // END Android-changed: Avoid calling replaceWith() for trusted subclasses.
    }

    /**
     * Initializes a new encoder. The new encoder will have the given
     * bytes-per-char values and its replacement will be the
     * byte array <tt>{</tt> <tt>(byte)'?'</tt> <tt>}</tt>.
     *
     * @param  cs
     *         The charset that created this encoder
     *
     * @param  averageBytesPerChar
     *         A positive float value indicating the expected number of
     *         bytes that will be produced for each input character
     *
     * @param  maxBytesPerChar
     *         A positive float value indicating the maximum number of
     *         bytes that will be produced for each input character
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameters do not hold
     */
    protected CharsetEncoder(Charset cs,
                             float averageBytesPerChar,
                             float maxBytesPerChar)
    {
        this(cs,
             averageBytesPerChar, maxBytesPerChar,
             new byte[] { (byte)'?' });
    }

    /**
     * Returns the charset that created this encoder.
     *
     * @return  This encoder's charset
     */
    public final Charset charset() {
        return charset;
    }

    /**
     * Returns this encoder's replacement value.
     *
     * @return  This encoder's current replacement,
     *          which is never <tt>null</tt> and is never empty
     */
    public final byte[] replacement() {
        // Return a defensive copy so callers cannot mutate the encoder's state.
        return Arrays.copyOf(replacement, replacement.length);
    }

    /**
     * Changes this encoder's replacement value.
     *
     * <p> This method invokes the {@link #implReplaceWith implReplaceWith}
     * method, passing the new replacement, after checking that the new
     * replacement is acceptable. </p>
     *
     * @param  newReplacement
     *         The new replacement value; must not be <tt>null</tt>, must have
     *         non-zero length, must not be longer than the value returned by
     *         the {@link #maxBytesPerChar() maxBytesPerChar} method, and
     *         must be {@link #isLegalReplacement legal}
     *
     * @return  This encoder
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameter do not hold
     */
    public final CharsetEncoder replaceWith(byte[] newReplacement) {
        if (newReplacement == null)
            throw new IllegalArgumentException("Null replacement");
        int len = newReplacement.length;
        if (len == 0)
            throw new IllegalArgumentException("Empty replacement");
        if (len > maxBytesPerChar)
            throw new IllegalArgumentException("Replacement too long");

        if (!isLegalReplacement(newReplacement))
            throw new IllegalArgumentException("Illegal replacement");
        this.replacement = Arrays.copyOf(newReplacement, newReplacement.length);

        implReplaceWith(this.replacement);
        return this;
    }

    /**
     * Reports a change to this encoder's replacement value.
     *
     * <p> The default implementation of this method does nothing.
     * This method should be overridden by encoders that require notification
     * of changes to the replacement. </p>
     *
     * @param  newReplacement  The replacement value
     */
    protected void implReplaceWith(byte[] newReplacement) {
    }

    private WeakReference<CharsetDecoder> cachedDecoder = null;

    /**
     * Tells whether or not the given byte array is a legal replacement value
     * for this encoder.
     *
     * <p> A replacement is legal if, and only if, it is a legal sequence of
     * bytes in this encoder's charset; that is, it must be possible to decode
     * the replacement into one or more sixteen-bit Unicode characters.
     *
     * <p> The default implementation of this method is not very efficient; it
     * should generally be overridden to improve performance. </p>
     *
     * @param  repl  The byte array to be tested
     *
     * @return  <tt>true</tt> if, and only if, the given byte array
     *          is a legal replacement value for this encoder
     */
    public boolean isLegalReplacement(byte[] repl) {
        WeakReference<CharsetDecoder> wr = cachedDecoder;
        CharsetDecoder dec = null;
        if ((wr == null) || ((dec = wr.get()) == null)) {
            dec = charset().newDecoder();
            dec.onMalformedInput(CodingErrorAction.REPORT);
            dec.onUnmappableCharacter(CodingErrorAction.REPORT);
            cachedDecoder = new WeakReference<CharsetDecoder>(dec);
        } else {
            dec.reset();
        }
        ByteBuffer bb = ByteBuffer.wrap(repl);
        CharBuffer cb = CharBuffer.allocate((int)(bb.remaining()
                                                  * dec.maxCharsPerByte()));
        CoderResult cr = dec.decode(bb, cb, true);
        return !cr.isError();
    }

    /**
     * Returns this encoder's current action for malformed-input errors.
     *
     * @return  The current malformed-input action, which is never <tt>null</tt>
     */
    public CodingErrorAction malformedInputAction() {
        return malformedInputAction;
    }

    /**
     * Changes this encoder's action for malformed-input errors.
     *
     * <p> This method invokes the {@link #implOnMalformedInput
     * implOnMalformedInput} method, passing the new action. </p>
     *
     * @param  newAction  The new action; must not be <tt>null</tt>
     *
     * @return  This encoder
     *
     * @throws  IllegalArgumentException
     *          If the precondition on the parameter does not hold
     */
    public final CharsetEncoder onMalformedInput(CodingErrorAction newAction) {
        if (newAction == null)
            throw new IllegalArgumentException("Null action");
        malformedInputAction = newAction;
        implOnMalformedInput(newAction);
        return this;
    }

    /**
     * Reports a change to this encoder's malformed-input action.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by encoders that require notification of changes to
     * the malformed-input action. </p>
     *
     * @param  newAction  The new action
     */
    protected void implOnMalformedInput(CodingErrorAction newAction) { }

    /**
     * Returns this encoder's current action for unmappable-character errors.
     *
     * @return  The current unmappable-character action, which is never
     *          <tt>null</tt>
     */
    public CodingErrorAction unmappableCharacterAction() {
        return unmappableCharacterAction;
    }

    /**
     * Changes this encoder's action for unmappable-character errors.
     *
     * <p> This method invokes the {@link #implOnUnmappableCharacter
     * implOnUnmappableCharacter} method, passing the new action.
     * </p>
     *
     * @param  newAction  The new action; must not be <tt>null</tt>
     *
     * @return  This encoder
     *
     * @throws  IllegalArgumentException
     *          If the precondition on the parameter does not hold
     */
    public final CharsetEncoder onUnmappableCharacter(CodingErrorAction
                                                      newAction)
    {
        if (newAction == null)
            throw new IllegalArgumentException("Null action");
        unmappableCharacterAction = newAction;
        implOnUnmappableCharacter(newAction);
        return this;
    }

    /**
     * Reports a change to this encoder's unmappable-character action.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by encoders that require notification of changes to
     * the unmappable-character action. </p>
     *
     * @param  newAction  The new action
     */
    protected void implOnUnmappableCharacter(CodingErrorAction newAction) { }

    /**
     * Returns the average number of bytes that will be produced for each
     * character of input. This heuristic value may be used to estimate the size
     * of the output buffer required for a given input sequence.
     *
     * @return  The average number of bytes produced
     *          per character of input
     */
    public final float averageBytesPerChar() {
        return averageBytesPerChar;
    }

    /**
     * Returns the maximum number of bytes that will be produced for each
     * character of input. This value may be used to compute the worst-case size
     * of the output buffer required for a given input sequence.
     *
     * @return  The maximum number of bytes that will be produced per
     *          character of input
     */
    public final float maxBytesPerChar() {
        return maxBytesPerChar;
    }

    /**
     * Encodes as many characters as possible from the given input buffer,
     * writing the results to the given output buffer.
     *
     * <p> The buffers are read from, and written to, starting at their current
     * positions. At most {@link Buffer#remaining in.remaining()} characters
     * will be read and at most {@link Buffer#remaining out.remaining()}
     * bytes will be written. The buffers' positions will be advanced to
     * reflect the characters read and the bytes written, but their marks and
     * limits will not be modified.
     *
     * <p> In addition to reading characters from the input buffer and writing
     * bytes to the output buffer, this method returns a {@link CoderResult}
     * object to describe its reason for termination:
     *
     * <ul>
     *
     *   <li><p> {@link CoderResult#UNDERFLOW} indicates that as much of the
     *   input buffer as possible has been encoded. If there is no further
     *   input then the invoker can proceed to the next step of the
     *   <a href="#steps">encoding operation</a>. Otherwise this method
     *   should be invoked again with further input. </p></li>
     *
     *   <li><p> {@link CoderResult#OVERFLOW} indicates that there is
     *   insufficient space in the output buffer to encode any more characters.
     *   This method should be invoked again with an output buffer that has
     *   more {@linkplain Buffer#remaining remaining} bytes. This is
     *   typically done by draining any encoded bytes from the output
     *   buffer. </p></li>
     *
     *   <li><p> A {@linkplain CoderResult#malformedForLength
     *   malformed-input} result indicates that a malformed-input
     *   error has been detected.
     *   The malformed characters begin at the input
     *   buffer's (possibly incremented) position; the number of malformed
     *   characters may be determined by invoking the result object's {@link
     *   CoderResult#length() length} method. This case applies only if the
     *   {@linkplain #onMalformedInput malformed action} of this encoder
     *   is {@link CodingErrorAction#REPORT}; otherwise the malformed input
     *   will be ignored or replaced, as requested. </p></li>
     *
     *   <li><p> An {@linkplain CoderResult#unmappableForLength
     *   unmappable-character} result indicates that an
     *   unmappable-character error has been detected. The characters that
     *   encode the unmappable character begin at the input buffer's (possibly
     *   incremented) position; the number of such characters may be determined
     *   by invoking the result object's {@link CoderResult#length() length}
     *   method. This case applies only if the {@linkplain #onUnmappableCharacter
     *   unmappable action} of this encoder is {@link
     *   CodingErrorAction#REPORT}; otherwise the unmappable character will be
     *   ignored or replaced, as requested. </p></li>
     *
     * </ul>
     *
     * In any case, if this method is to be reinvoked in the same encoding
     * operation then care should be taken to preserve any characters remaining
     * in the input buffer so that they are available to the next invocation.
     *
     * <p> The <tt>endOfInput</tt> parameter advises this method as to whether
     * the invoker can provide further input beyond that contained in the given
     * input buffer. If there is a possibility of providing additional input
     * then the invoker should pass <tt>false</tt> for this parameter; if there
     * is no possibility of providing further input then the invoker should
     * pass <tt>true</tt>. It is not erroneous, and in fact it is quite
     * common, to pass <tt>false</tt> in one invocation and later discover that
     * no further input was actually available.
     * It is critical, however, that
     * the final invocation of this method in a sequence of invocations always
     * pass <tt>true</tt> so that any remaining unencoded input will be treated
     * as being malformed.
     *
     * <p> This method works by invoking the {@link #encodeLoop encodeLoop}
     * method, interpreting its results, handling error conditions, and
     * reinvoking it as necessary. </p>
     *
     *
     * @param  in
     *         The input character buffer
     *
     * @param  out
     *         The output byte buffer
     *
     * @param  endOfInput
     *         <tt>true</tt> if, and only if, the invoker can provide no
     *         additional input characters beyond those in the given buffer
     *
     * @return  A coder-result object describing the reason for termination
     *
     * @throws  IllegalStateException
     *          If an encoding operation is already in progress and the previous
     *          step was an invocation neither of the {@link #reset reset}
     *          method, nor of this method with a value of <tt>false</tt> for
     *          the <tt>endOfInput</tt> parameter, nor of this method with a
     *          value of <tt>true</tt> for the <tt>endOfInput</tt> parameter
     *          but a return value indicating an incomplete encoding operation
     *
     * @throws  CoderMalfunctionError
     *          If an invocation of the encodeLoop method threw
     *          an unexpected exception
     */
    public final CoderResult encode(CharBuffer in, ByteBuffer out,
                                    boolean endOfInput)
    {
        int newState = endOfInput ?
            ST_END : ST_CODING;
        if ((state != ST_RESET) && (state != ST_CODING)
            && !(endOfInput && (state == ST_END)))
            throwIllegalStateException(state, newState);
        state = newState;

        for (;;) {

            CoderResult cr;
            try {
                cr = encodeLoop(in, out);
            } catch (BufferUnderflowException x) {
                throw new CoderMalfunctionError(x);
            } catch (BufferOverflowException x) {
                throw new CoderMalfunctionError(x);
            }

            if (cr.isOverflow())
                return cr;

            if (cr.isUnderflow()) {
                if (endOfInput && in.hasRemaining()) {
                    cr = CoderResult.malformedForLength(in.remaining());
                    // Fall through to malformed-input case
                } else {
                    return cr;
                }
            }

            CodingErrorAction action = null;
            if (cr.isMalformed())
                action = malformedInputAction;
            else if (cr.isUnmappable())
                action = unmappableCharacterAction;
            else
                assert false : cr.toString();

            if (action == CodingErrorAction.REPORT)
                return cr;

            if (action == CodingErrorAction.REPLACE) {
                if (out.remaining() < replacement.length)
                    return CoderResult.OVERFLOW;
                out.put(replacement);
            }

            if ((action == CodingErrorAction.IGNORE)
                || (action == CodingErrorAction.REPLACE)) {
                // Skip erroneous input either way
                in.position(in.position() + cr.length());
                continue;
            }

            assert false;
        }

    }

    /**
     * Flushes this encoder.
     *
     * <p> Some encoders maintain internal state and may need to write some
     * final bytes to the output buffer once the overall input sequence has
     * been read.
     *
     * <p> Any additional output is written to the output buffer beginning at
     * its current position. At most {@link Buffer#remaining out.remaining()}
     * bytes will be written. The buffer's position will be advanced
     * appropriately, but its mark and limit will not be modified.
     *
     * <p> If this method completes successfully then it returns {@link
     * CoderResult#UNDERFLOW}.
     * If there is insufficient room in the output
     * buffer then it returns {@link CoderResult#OVERFLOW}. If this happens
     * then this method must be invoked again, with an output buffer that has
     * more room, in order to complete the current <a href="#steps">encoding
     * operation</a>.
     *
     * <p> If this encoder has already been flushed then invoking this method
     * has no effect.
     *
     * <p> This method invokes the {@link #implFlush implFlush} method to
     * perform the actual flushing operation. </p>
     *
     * @param  out
     *         The output byte buffer
     *
     * @return  A coder-result object, either {@link CoderResult#UNDERFLOW} or
     *          {@link CoderResult#OVERFLOW}
     *
     * @throws  IllegalStateException
     *          If the previous step of the current encoding operation was an
     *          invocation neither of the {@link #flush flush} method nor of
     *          the three-argument {@link
     *          #encode(CharBuffer,ByteBuffer,boolean) encode} method
     *          with a value of <tt>true</tt> for the <tt>endOfInput</tt>
     *          parameter
     */
    public final CoderResult flush(ByteBuffer out) {
        if (state == ST_END) {
            CoderResult cr = implFlush(out);
            if (cr.isUnderflow())
                state = ST_FLUSHED;
            return cr;
        }

        if (state != ST_FLUSHED)
            throwIllegalStateException(state, ST_FLUSHED);

        return CoderResult.UNDERFLOW; // Already flushed
    }

    /**
     * Flushes this encoder.
     *
     * <p> The default implementation of this method does nothing, and always
     * returns {@link CoderResult#UNDERFLOW}. This method should be overridden
     * by encoders that may need to write final bytes to the output buffer
     * once the entire input sequence has been read.
     * </p>
     *
     * @param  out
     *         The output byte buffer
     *
     * @return  A coder-result object, either {@link CoderResult#UNDERFLOW} or
     *          {@link CoderResult#OVERFLOW}
     */
    protected CoderResult implFlush(ByteBuffer out) {
        return CoderResult.UNDERFLOW;
    }

    /**
     * Resets this encoder, clearing any internal state.
     *
     * <p> This method resets charset-independent state and also invokes the
     * {@link #implReset() implReset} method in order to perform any
     * charset-specific reset actions. </p>
     *
     * @return  This encoder
     *
     */
    public final CharsetEncoder reset() {
        implReset();
        state = ST_RESET;
        return this;
    }

    /**
     * Resets this encoder, clearing any charset-specific internal state.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by encoders that maintain internal state. </p>
     */
    protected void implReset() { }

    /**
     * Encodes one or more characters into one or more bytes.
     *
     * <p> This method encapsulates the basic encoding loop, encoding as many
     * characters as possible until it either runs out of input, runs out of room
     * in the output buffer, or encounters an encoding error. This method is
     * invoked by the {@link #encode encode} method, which handles result
     * interpretation and error recovery.
     *
     * <p> The buffers are read from, and written to, starting at their current
     * positions. At most {@link Buffer#remaining in.remaining()} characters
     * will be read, and at most {@link Buffer#remaining out.remaining()}
     * bytes will be written. The buffers' positions will be advanced to
     * reflect the characters read and the bytes written, but their marks and
     * limits will not be modified.
     *
     * <p> This method returns a {@link CoderResult} object to describe its
     * reason for termination, in the same manner as the {@link #encode encode}
     * method. Most implementations of this method will handle encoding errors
     * by returning an appropriate result object for interpretation by the
     * {@link #encode encode} method. An optimized implementation may instead
     * examine the relevant error action and implement that action itself.
     *
     * <p> An implementation of this method may perform arbitrary lookahead by
     * returning {@link CoderResult#UNDERFLOW} until it receives sufficient
     * input. </p>
     *
     * @param  in
     *         The input character buffer
     *
     * @param  out
     *         The output byte buffer
     *
     * @return  A coder-result object describing the reason for termination
     */
    protected abstract CoderResult encodeLoop(CharBuffer in,
                                              ByteBuffer out);

    /**
     * Convenience method that encodes the remaining content of a single input
     * character buffer into a newly-allocated byte buffer.
     *
     * <p> This method implements an entire <a href="#steps">encoding
     * operation</a>; that is, it resets this encoder, then it encodes the
     * characters in the given character buffer, and finally it flushes this
     * encoder. This method should therefore not be invoked if an encoding
     * operation is already in progress. </p>
     *
     * @param  in
     *         The input character buffer
     *
     * @return  A newly-allocated byte buffer containing the result of the
     *          encoding operation. The buffer's position will be zero and its
     *          limit will follow the last byte written.
     *
     * @throws  IllegalStateException
     *          If an encoding operation is already in progress
     *
     * @throws  MalformedInputException
     *          If the character sequence starting at the input buffer's current
     *          position is not a legal sixteen-bit Unicode sequence and the current malformed-input action
     *          is {@link CodingErrorAction#REPORT}
     *
     * @throws  UnmappableCharacterException
     *          If the character sequence starting at the input buffer's current
     *          position cannot be mapped to an equivalent byte sequence and
     *          the current unmappable-character action is {@link
     *          CodingErrorAction#REPORT}
     */
    public final ByteBuffer encode(CharBuffer in)
        throws CharacterCodingException
    {
        int n = (int)(in.remaining() * averageBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(n);

        if ((n == 0) && (in.remaining() == 0))
            return out;
        reset();
        for (;;) {
            CoderResult cr = in.hasRemaining() ?
                encode(in, out, true) : CoderResult.UNDERFLOW;
            if (cr.isUnderflow())
                cr = flush(out);

            if (cr.isUnderflow())
                break;
            if (cr.isOverflow()) {
                n = 2*n + 1;    // Ensure progress; n might be 0!
                ByteBuffer o = ByteBuffer.allocate(n);
                out.flip();
                o.put(out);
                out = o;
                continue;
            }
            cr.throwException();
        }
        out.flip();
        return out;
    }

    private boolean canEncode(CharBuffer cb) {
        if (state == ST_FLUSHED)
            reset();
        else if (state != ST_RESET)
            throwIllegalStateException(state, ST_CODING);

        // BEGIN Android-added: Fast path handling for empty buffers.
        // Empty buffers can always be "encoded".
        if (!cb.hasRemaining()) {
            return true;
        }
        // END Android-added: Fast path handling for empty buffers.

        CodingErrorAction ma = malformedInputAction();
        CodingErrorAction ua = unmappableCharacterAction();
        try {
            onMalformedInput(CodingErrorAction.REPORT);
            onUnmappableCharacter(CodingErrorAction.REPORT);
            encode(cb);
        } catch (CharacterCodingException x) {
            return false;
        } finally {
            onMalformedInput(ma);
            onUnmappableCharacter(ua);
            reset();
        }
        return true;
    }

    /**
     * Tells whether or not this encoder can encode the given character.
     *
     * <p> This method returns <tt>false</tt> if the given character is a
     * surrogate character; such characters can be interpreted only when they
     * are members of a pair consisting of a high surrogate followed by a low
     * surrogate.  The {@link #canEncode(java.lang.CharSequence)
     * canEncode(CharSequence)} method may be used to test whether or not a
     * character sequence can be encoded.
     *
     * <p> This method may modify this encoder's state; it should therefore not
     * be invoked if an <a href="#steps">encoding operation</a> is already in
     * progress.
     *
     * <p> The default implementation of this method is not very efficient; it
     * should generally be overridden to improve performance.  </p>
     *
     * @param   c
     *          The given character
     *
     * @return  <tt>true</tt> if, and only if, this encoder can encode
     *          the given character
     *
     * @throws  IllegalStateException
     *          If an encoding operation is already in progress
     */
    public boolean canEncode(char c) {
        CharBuffer cb = CharBuffer.allocate(1);
        cb.put(c);
        cb.flip();
        return canEncode(cb);
    }

    /**
     * Tells whether or not this encoder can encode the given character
     * sequence.
     *
     * <p> If this method returns <tt>false</tt> for a particular character
     * sequence then more information about why the sequence cannot be encoded
     * may be obtained by performing a full <a href="#steps">encoding
     * operation</a>.
     *
     * <p> This method may modify this encoder's state; it should therefore not
     * be invoked if an encoding operation is already in progress.
     *
     * <p> The default implementation of this method is not very efficient; it
     * should generally be overridden to improve performance.  </p>
     *
     * @param   cs
     *          The given character sequence
     *
     * @return  <tt>true</tt> if, and only if, this encoder can encode
     *          the given character sequence without throwing any exceptions
     *          and without performing any replacements
     *
     * @throws  IllegalStateException
     *          If an encoding operation is already in progress
     */
    public boolean canEncode(CharSequence cs) {
        CharBuffer cb;
        if (cs instanceof CharBuffer)
            cb = ((CharBuffer)cs).duplicate();
        else
            // Android-removed: An unnecessary call to toString().
            // cb = CharBuffer.wrap(cs.toString());
            cb = CharBuffer.wrap(cs);
        return canEncode(cb);
    }

    private void throwIllegalStateException(int from, int to) {
        throw new IllegalStateException("Current state = " + stateNames[from]
                                        + ", new state = " + stateNames[to]);
    }

}
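
// Usage sketch (not part of the original file): a caller might exercise the
// convenience methods documented above roughly as follows. The class name
// EncoderSketch and the helper tryEncode are hypothetical and exist only for
// illustration; the charsets come from java.nio.charset.StandardCharsets.
// The helper relies on encode(CharBuffer) running a complete encoding
// operation, and on canEncode(char) returning false for a lone surrogate.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class EncoderSketch {

    /**
     * Hypothetical helper: encodes text with both error actions set to
     * REPORT, returning the encoded bytes, or null if the text is malformed
     * or unmappable in the encoder's charset.
     */
    static byte[] tryEncode(CharsetEncoder encoder, String text) {
        encoder.onMalformedInput(CodingErrorAction.REPORT)
               .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // encode(CharBuffer) resets, encodes and flushes in one call.
            ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
            // The returned buffer has position zero and its limit after the
            // last byte written, so remaining() is the encoded length.
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        CharsetEncoder ascii = StandardCharsets.US_ASCII.newEncoder();
        CharsetEncoder utf8 = StandardCharsets.UTF_8.newEncoder();

        System.out.println(ascii.canEncode("hello"));        // true
        System.out.println(ascii.canEncode('\u00e9'));       // false: not in US-ASCII
        System.out.println(utf8.canEncode('\uD83D'));        // false: lone high surrogate
        System.out.println(utf8.canEncode("\uD83D\uDE00"));  // true: complete surrogate pair

        System.out.println(tryEncode(ascii, "hello").length);     // 5
        System.out.println(tryEncode(ascii, "caf\u00e9") == null); // true
    }
}
```

// Returning null on failure keeps the sketch short; a real caller would more
// likely let the CharacterCodingException propagate so that the malformed or
// unmappable input can be reported.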